ALIGNMENTS



RESULT 1 
AAB72531 

ID AAB72531 standard; peptide; 15 AA. 
XX 

AC AAB72531; 
XX 

DT 09-MAY-2001 (first entry) 
XX 

DE Colostrinin peptide #32. 
XX 

KW Dermatological; oxidative stress regulator; colostrinin. 
XX 

OS Unidentified. 
XX 

PN WO200112650-A2. 
XX 



PD 22-FEB-2001. 
XX 

PF 17-AUG-2000; 2000WO-US022665.
XX 

PR 17-AUG-1999; 99US-0149310P.
XX 

PA (TEXA ) UNIV TEXAS SYSTEM. 
XX 

PI Stanton GJ, Hughes TK, Boldogh I; 
XX 

DR WPI; 2001-218342/22. 
XX 

PT Modulating oxidative stress level in a cell, involves contacting the cell 

PT with an oxidative stress regulator selected from colostrinin, its 

PT constituent peptide, analog or their combinations. 
XX 

PS Claim 6; Page 26; 48pp; English. 
XX 

CC The present invention relates to a method for modulating the oxidative 

CC stress level in a cell or a patient, comprising contacting the cell with, 

CC or administering to the patient, an oxidative stress regulator selected 

CC from colostrinin, or its constituent peptide (e.g. the present peptide), 

CC to change the level of an oxidising species in the cell. The method can 

CC be used to treat oxidative damage to skin, by decreasing or preventing an 

CC increase in the level of damage to a biomolecule of the patient 
XX 

SQ Sequence 15 AA; 

Query Match 100.0%; Score 89; DB 4; Length 15; 

Best Local Similarity 100.0%; Pred. No. 9.2e-05; 

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0 

Qy          1 MHQPPQPLPPTVMFP 15
              |||||||||||||||
Db          1 MHQPPQPLPPTVMFP 15



RESULT 2
AAB59334

ID AAB59334 standard; peptide; 15 AA.
XX

AC AAB59334;
XX

DT 21-MAR-2001 (first entry)
XX

DE Ewe colostrinin peptide fragment C-9.
XX

KW Sheep; colostrinin; proline rich polypeptide; colostrum; immune disorder;
KW central nervous system disorder; dietary supplement; beta-amyloid plaque.
XX

OS Ovis sp.
XX

PN WO200075173-A2.
XX

PD 14-DEC-2000.
XX

PF 02-JUN-2000; 2000WO-GB002128.
XX

PR 02-JUN-1999; 99GB-00012852 . 
XX 

PA (REGE-) REGEN THERAPEUTICS PLC. 
XX 

PI Georgiades JA; 
XX 

DR WPI; 2001-071058/08. 
XX 

PT Peptides having an N-terminal amino acid sequence isolated from 

PT colostrinin for treating e.g. disorders of the central nervous system and 

PT immune system, viral and bacterial infections, and diseases characterized 

PT by amyloid plaques. 

XX 

PS Claim 7; Page 27; 63pp; English. 
XX 

CC The present invention provides the sequences of a number of peptides 

CC found in ewe's colostrinin. Colostrinin is the proline-rich polypeptide 

CC fragment of colostrum. These peptides can be used in the treatment of 

CC central nervous system disorders such as senile dementia, Parkinson's 

CC disease, Alzheimer's disease, psychosis and neurosis, immune system 

CC disorders such as bacterial and viral infections, to improve the 

CC development of a child's immune system, as a dietary supplement, and to 

CC promote the dissolution of beta-amyloid plaques 
XX 

SQ Sequence 15 AA; 

Query Match 100.0%; Score 89; DB 4; Length 15; 

Best Local Similarity 100.0%; Pred. No. 9.2e-05; 

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0 



Qy          1 MHQPPQPLPPTVMFP 15
              |||||||||||||||
Db          1 MHQPPQPLPPTVMFP 15




RESULT 3
AAB72279

ID AAB72279 standard; peptide; 15 AA.
XX

AC AAB72279;
XX

DT 14-MAY-2001 (first entry)
XX

DE Colostrinin derived cytokine inducing peptide SEQ ID 34.
XX

KW Colostrinin; immune response; cytokine; blood cell proliferation;
KW central nervous system disorder; neurological disorder; mental disorder;
KW dementia; neurodegenerative disease; Alzheimer's disease; psychosis;
KW neurosis; infection.
XX

OS Synthetic.
XX

PN WO200111937-A2.
XX

PD 22-FEB-2001.
XX







PF 17-AUG-2000; 2000WO-US022818 . 
XX 

PR 17-AUG-1999; 99US-0149311P . 
XX 

PA (TEXA ) UNIV TEXAS SYSTEM. 

PA (REGE-) REGEN THERAPEUTICS PLC. 

XX 

PI Stanton GJ, Hughes TK, Boldogh I, Georgiades J; 
XX 

DR WPI; 2001-202804/20. 
XX 

PT Inducing a cytokine and modulating an immune response, useful for 

PT treating central nervous system diseases and bacterial and viral 

PT infections, comprises administering colostrinin as an immunological 

PT regulator. 
XX 

PS Claim 1; Page 34; 50pp; English. 
XX 

CC Sequences AAB72246 - AAB72275 represent peptides derived from colostrinin,

CC a proline rich polypeptide aggregate contained in colostrum. The peptides 

CC have immune response modulatory activity, and are capable of inducing 

CC cytokines. Colostrinin and its derived peptides are useful for inducing 

CC cytokine production, for modulating an immunological response and for 

CC inducing blood cell proliferation. The peptides are useful in the 

CC treatment of disorders of the central nervous system, neurological 

CC disorders, mental disorders, dementia, neurodegenerative diseases, 

CC Alzheimer's disease, motor neurone disease, psychosis, neurosis, chronic 

CC disorders of the immune system, bacterial and viral infections and 

CC acquired immunological deficiencies 

XX 

SQ Sequence 15 AA; 

Query Match 100.0%; Score 89; DB 4; Length 15; 
Best Local Similarity 100.0%; Pred. No. 9.2e-05; 

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0 



Qy          1 MHQPPQPLPPTVMFP 15
              |||||||||||||||
Db          1 MHQPPQPLPPTVMFP 15


RESULT 4
AAB72563

ID AAB72563 standard; peptide; 15 AA.
XX

AC AAB72563;
XX

DT 09-MAY-2001 (first entry)
XX

DE Colostrinin peptide #32.
XX

KW Neuroprotective; neural cell differentiation regulator; colostrinin;
KW colostrum.
XX

OS Unidentified.
XX

PN WO200112651-A2.
XX

PD 22-FEB-2001. 
XX 

PF 17-AUG-2000; 2000WO-US022774 . 
XX 

PR 17-AUG-1999; 99US-0149633P . 
XX 

PA (TEXA ) UNIV TEXAS SYSTEM. 
XX 

PI Boldogh I; 
XX 

DR WPI; 2001-226545/23. 
XX 

PT Use of colostrinin, its constituent peptide or analog as a neural cell 

PT regulator, for promoting neural cell differentiation and treating damaged 

PT neural cells in a patient. 
XX 

PS Claim 6; Page 22; 35pp; English. 
XX 

CC The present invention relates to a method for promoting neural cell 

CC differentiation and treating damaged neural cells, using colostrinin and 

CC colostrinin constituent peptides (e.g. the present peptide) as a neural 

CC cell regulator. Colostrinin is a polypeptide complex found in colostrum 

XX 

SQ Sequence 15 AA; 

Query Match 100.0%; Score 89; DB 4; Length 15; 
Best Local Similarity 100.0%; Pred. No. 9.2e-05; 

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0 

Qy          1 MHQPPQPLPPTVMFP 15
              |||||||||||||||
Db          1 MHQPPQPLPPTVMFP 15



RESULT 5 
AAO14610 

ID AAO14610 standard; peptide; 15 AA.
XX 

AC AAO14610; 
XX 

DT 27-MAY-2002 (first entry) 
XX 

DE Neural cell regulatory colostrinin peptide 32. 
XX 

KW Neural cell differentiation; neural cell regulator; colostrinin peptide; 
KW neural cell formation; proline-rich polypeptide aggregate; colostrum; 
KW neural cell treatment. 
XX 

OS Unidentified. 
XX 

FH Key Location/Qualifiers 
FT Modified-site 15

FT /note= "Optional C-terminal amide" 

XX 

PN WO200213851-A1. 
XX 



PD 21-FEB-2002. 
XX 

PF 17-AUG-2000; 2000WO-US022777 . 
XX 

PR 17-AUG-2000; 2000WO-US022777 . 
XX 

PA (TEXA ) UNIV TEXAS SYSTEM. 
XX 

PI Boldogh I, Stanton JG, Hughes TK; 
XX 

DR WPI; 2002-269152/31. 
XX 

PT Promoting cell differentiation in a patient involves use of blood cell 

PT regulator selected from colostrinin, its constituent peptide and/or 

PT analog. 
XX 

PS Claim 7; Page 22; 37pp; English. 
XX 

CC The invention comprises a method for promoting cell differentiation (e.g. 

CC neural cell differentiation). The method involves contacting cells with a

CC neural cell regulator (i.e. a colostrinin peptide) in order to change the 

CC cells in morphology to form neural cells. Colostrinin is a proline-rich 

CC polypeptide aggregate that is present in colostrum. The method of the 

CC invention is useful for promoting the differentiation of cells and for 

CC treating damaged neural cells in a patient. The present amino acid 

CC sequence represents a specifically claimed colostrinin peptide used in 

CC the method of the invention 
XX 

SQ Sequence 15 AA; 

Query Match 100.0%; Score 89; DB 5; Length 15; 
Best Local Similarity 100.0%; Pred. No. 9.2e-05; 

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0 



Qy          1 MHQPPQPLPPTVMFP 15
              |||||||||||||||
Db          1 MHQPPQPLPPTVMFP 15




RESULT 6
AAM51066

ID AAM51066 standard; peptide; 15 AA.
XX

AC AAM51066;
XX

DT 30-MAY-2002 (first entry)
XX

DE Colostrinin constituent peptide (casein amino acids 159-173).
XX

KW Colostrinin; colostrum; immunomodulator; cardiovascular;
KW blood cell regulator; cytokine inducer; beta-casein; human.
XX

OS Homo sapiens.
XX

FH Key Location/Qualifiers
FT Modified-site 15
FT /note= "optional C-terminal amidation"
XX

PN WO200213849-A1. 
XX 

PD 21-FEB-2002. 
XX 

PF 17-AUG-2000; 2000WO-US022775.
XX

PR 17-AUG-2000; 2000WO-US022775.
XX 

PA (TEXA ) UNIV TEXAS SYSTEM. 

PA (REGE-) REGEN THERAPEUTICS PLC. 

XX 

PI Stanton GJ, Hughes TK, Boldogh I, Georgiades J; 
XX 

DR WPI; 2002-269150/31.
XX 

PT Modulation of blood cell proliferation in a patient involves use of blood 

PT cell regulator selected from colostrinin, its constituent peptide and/or 

PT analog. 
XX 

PS Claim 1; Page 34; 54pp; English. 
XX 

CC The present sequence is that of a colostrinin constituent peptide that is 

CC used as an immunological regulator and as a blood cell regulator in 

CC claimed methods of the invention. It is classified as having a beta-

CC casein homologue precursor, and corresponds to casein amino acids 159-

CC 173. Methods are claimed for: inducing a cytokine in a cell by contact

CC with an immunological regulator, where the cell is present in a cell 

CC culture, a tissue, an organ or an organism, and the cell is mammalian, 

CC including human; modulating an immune response in a cell by contact with 

CC the immunological regulator under conditions effective to induce a 

CC cytokine; modulating an immune response in a patient by administering an 

CC immunological regulator under conditions effective to induce a cytokine, 

CC where the immunological regulator is administered topically or as part of 

CC a dietary supplement, and where the immune response is specific or non 

CC specific, an interferon response or an antibody response; modulating 

CC blood cell proliferation by contacting blood cells with a blood cell 

CC regulator, where the blood cells are present in a cell culture or an 

CC organism, are mammalian or human, and where the blood cells are increased 

CC in number or differentiated; and a method for modulating blood cell 

CC proliferation in a patient. A claimed cytokine-inducing composition

CC comprises a pharmaceutical carrier and an active agent such as the 

CC present peptide 

XX 

SQ Sequence 15 AA;

Query Match 100.0%; Score 89; DB 5; Length 15; 
Best Local Similarity 100.0%; Pred. No. 9.2e-05; 

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 MHQPPQPLPPTVMFP 15
              |||||||||||||||
Db          1 MHQPPQPLPPTVMFP 15



RESULT 7 
AAE20261

ID AAE20261 standard; peptide; 15 AA.
XX 

AC AAE20261; 
XX 

DT 18-JUN-2002 (first entry) 
XX 

DE Colostrinin constituent peptide #32.
XX 

KW Blood cell regulator; colostrinin; constituent peptide; oxidative stress; 

KW therapy; oxidative damage; skin; aging; wound healing; cell replacement; 

KW tissue; organ; cosmetic procedure; repair; regeneration; preservation; 

KW transplantation; implantation; dermatological; vulnerary.
XX 

OS Unidentified. 
XX 

FH Key Location/Qualifiers 

FT Modified-site 15

FT /note= "Optionally C-terminal amide" 

XX 

PN WO200213850-A1. 
XX 

PD 21-FEB-2002. 
XX 

PF 17-AUG-2000; 2000WO-US022776 . 
XX 

PR 17-AUG-2000; 2000WO-US022776 . 
XX 

PA (TEXA ) UNIV TEXAS SYSTEM. 
XX 

PI Stanton GJ, Hughes TK, Boldogh I; 
XX 

DR WPI; 2002-269151/31. 
XX 

PT Composition useful for the modulation of blood cell proliferation in a 

PT patient comprises a blood cell regulator selected from colostrinin, its 

PT constituent peptide and/or analog. 
XX 

PS Claim 6; Page 26; 51pp; English. 
XX 

CC The invention relates to a composition which comprises a blood cell 

CC regulator selected from colostrinin, its constituent peptide and/or 

CC analogue. The invention is used for modulating the oxidative stress level 

CC in a cell e.g. mammalian or human cell present in a cell culture, tissue, 

CC organ, or organism; or for treating oxidative damage to the skin of a 

CC patient e.g. animal or human; to modulate oxidative stress during/after

CC a premature birth or normal birth, preventing/delaying aging in a 

CC patient, enhancing wound healing, and the reduction of side effects of 

CC cosmetic procedures. The method changes the level of an oxidising species 

CC in the cell, such as decreases or prevents increase in the level of 

CC damage to a biomolecule of the patient selected from DNA, protein and/or 

CC lipid, compared to the same conditions when the oxidative stress 

CC regulator is not present. The modulation of oxidative stress results in 

CC enhanced repair, regeneration, and replacement of cells, tissues and 

CC organs (e.g. kidney, liver, pancreas, skin, and the other internal and 

CC external organs), as well as enhanced preservation of such organs for 

CC transplantation, implantation, or scientific research. The present 

CC sequence is a colostrinin constituent peptide 



XX 

SQ Sequence 15 AA; 



Query Match 100.0%; Score 89; DB 5; Length 15;

Best Local Similarity 100.0%; Pred. No. 9.2e-05; 

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 MHQPPQPLPPTVMFP 15
              |||||||||||||||
Db          1 MHQPPQPLPPTVMFP 15



RESULT 8 
AAW00679 

ID AAW00679 standard; protein; 222 AA. 
XX 

AC AAW00679; 
XX 

DT 22-APR-1997 (first entry) 
XX 

DE Beta-casein.
XX 

KW Beta-casein; goat; transgenic antithrombin III; factor XI; factor X; 

KW tAT3; serine protease inhibitor; thrombin; factor VII; factor IX; plasma; 

KW factor XII; mammary gland specific; hereditary AT3 deficiency; therapy;

KW acquired AT3 deficiency; heparin affinity. 

XX 

OS Capra hircus. 
XX 

PN WO9626268-A1.
XX 

PD 29-AUG-1996. 
XX 

PF 21-FEB-1996; 96WO-US002420 . 
XX 

PR 21-FEB-1995; 95US-00391743 . 
XX 

PA (GENZ ) GENZYME TRANSGENICS CORP. 
XX 

PI Ditullio P, Meade H, Cole ES; 
XX 

DR WPI; 1996-402361/40.

DR N-PSDB; AAT59829, AAT59830, AAT59831, AAT59832, AAT59833, AAT59834,

DR AAT59835.

XX 

PT New transgenically produced antithrombin III - useful for treating 

PT acquired or inherited AT3 deficiency, with faster clearance rate than 

PT plasma AT3. 
XX 

PS Disclosure; Fig 10; 37pp; English. 
XX 

CC This sequence represents the goat beta-casein. The gene encoding this 

CC protein was used to produce the transgenic antithrombin III (tAT3) of the 

CC invention. AT3 is a serine protease inhibitor, which inhibits thrombin 

CC and the activated forms of factors X, VII, IX, XI, and XII. The 

CC transgenic tAT3 of the invention includes a monosaccharide composition 

CC containing N-acetylgalactosamine (GalNAc). To produce a mammary gland



CC specific transgene, human AT3 cDNA (cloned as an 18.5 kb fragment) was 

CC inserted into the goat beta-casein gene. The final 14.95 kb vector was 

CC microinjected into goat embryos, which when developed produced the tAT3 

CC in their milk. The tAT3 lacks O-linked glycosylation, and the major

CC glycoform has a complex oligosaccharide at each glycosylation site,

CC except the Asn155 residue. At Asn155, it has a significant amount of

CC oligomannose, and hybrid residues. The tAT3 is used in the same way as

CC plasma AT3, for the treatment of hereditary or acquired AT3 deficiency.

CC The tAT3, however, has a faster clearance rate than plasma AT3 and may

CC also have a higher affinity for heparin. The tAT3 can also be produced

CC without the variability, immunogenicity and viral contamination problems 

CC associated with plasma-derived material 
XX 

SQ Sequence 222 AA; 

Query Match 91.0%; Score 81; DB 2; Length 222; 

Best Local Similarity 93.3%; Pred. No. 0.01;

Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMFP 15
              |||||||| ||||||
Db        159 MHQPPQPLSPTVMFP 173



RESULT 9
AAR80281

ID AAR80281 standard; protein; 209 AA.
XX
AC AAR80281;
XX
DT 14-FEB-1996 (first entry)
XX
DE Methyl or ethyl esterified bovine beta-casein A1.
XX
KW Bovine; beta-casein; ethyl esterification; pepsin hydrolysis;
KW proteolysis; peptide ester; food; pharmaceutical; cosmetics.
XX
OS Bos taurus.
XX
FH Key Location/Qualifiers
FT Protein 1..209
FT /note= "55% esterified by methanol or by ethanol,
FT resulting in atypical pepsin cleavage sites, in addition
FT to the naturally occurring (native) sites"
FT Peptide 2..25
FT /label= A
FT /note= "tryptic peptide from native protein"
FT Cleavage-site 4..5
FT /note= "pepsin cleavage site in native protein"
FT Cleavage-site 5..6
FT /note= "pepsin cleavage site in native protein and in
FT methyl ester of beta-casein"
FT Cleavage-site 11..12
FT /note= "newly identified pepsin cleavage site in methyl
FT ester of beta-casein"
FT Cleavage-site 15..16
FT /note= "pepsin cleavage site in native protein"


FT Modified-site (location illegible)
FT /note= "phosphorylated"
FT Modified-site (location illegible)
FT /note= "phosphorylated"
FT Modified-site (location illegible)
FT /note= "phosphorylated"
FT Modified-site (location illegible)
FT /note= "phosphorylated"
FT Peptide (location illegible)
FT /label= B
FT /note= "tryptic peptide from native protein"
XX
XX
XX

PA (INRG ) INST NAT RECH AGRONOMIQUE. 
XX 

PI Chobert J, Briand L, Haertle T; 
XX 

DR WPI; 1995-240679/31. 
XX 

PT New esterified amino acids, peptide(s) and their mixts. - prepd. by

PT esterification of protein then enzymatic hydrolysis, useful as

PT ingredients and additives in foods, pharmaceuticals and cosmetics. 

XX 

PS Claim 7; Fig 7 and 18; 47pp; French. 
XX 

CC The native form of bovine beta-casein A1 contains various pepsin cleavage

CC sites. Esterification of the protein with methanol or ethanol results in

CC a form of beta-casein contg. additional, non-conventional pepsin cleavage

CC sites (see Features Table). Esterified peptides and amino acids (and

CC their mixtures) resulting from hydrolysis of an esterified protein (pref. 

CC beta-lactoglobulin or beta-casein) are claimed. The hydrolysis products 

CC are useful as ingredients, additives or active agents in foods, 

CC pharmaceuticals and cosmetics 

XX 
PT Selecting non-diabetogenic milk and milk prods. - by testing milk or cows
PT for the presence of non-diabetogenic variants of beta-casein.
PS Disclosure; Fig 2; 28pp; English.
CC A method for selecting milk for feeding to diabetes susceptible individuals
CC comprises testing milk from identified cows for the presence of variants
CC diabetogenic variants and milking these cows separately. The milk and
PT New casein and its preparation - has a higher mineral solubilising

PT effect. 

XX 

PS Claim 1; Page 23-24; 32pp; English. 
XX 

CC This sequence represents the modified casein protein of the invention. It 

CC is based on the bovine casein sequence and has the following 

CC substitutions in comparison with the conventional A2 variant of beta- 

CC casein: R25C, L88I, Q117E, E175Q and Q195E. The new casein has a higher 

CC mineral solubilising effect and is therefore more effective than previous 

CC caseins 
PT products fortified with betaine, cobalamin, folic acid or pyridoxine.
CC The invention relates to a dietary supplement which, when consumed, is

CC capable of reducing plasma levels of homocyst(e)ine (tHcy). tHcy is a

CC major risk indicator of heart disease and vascular disease in general in 

CC humans. Vascular wall health is also seriously compromised in patients 

CC with clinical or unrecognised diabetes, with tHcy being a strong risk 

CC factor for mortality in type II diabetic patients. Deficiencies in folic 

CC acid, pyridoxine and cobalamin lead to higher tHcy levels, and folic acid 

CC deficiency is known to be involved in vascular disease, as well as 

CC causing neural tube defects in early embryonic development. The dietary 

CC supplement of the invention comprises milk or a milk product, fortified 

CC by the addition of betaine, cobalamin, folic acid, pyridoxine or their 

CC analogues. In addition, the beta-casein component of the milk is 

CC substantially the A2 variant. Beta-casein types A1 and B, consumption of

CC which are correlated with the incidence of type I diabetes, are 

CC substantially excluded from the supplement. The dietary supplement is 

CC useful for reducing the incidence of vascular disease, including 

CC peripheral vascular disease and blood vessel wall degeneration and 

CC particularly cardiovascular disease and cerebrovascular disease, and is 

CC also useful for reducing the incidence of type I and II diabetes. It 



CC additionally provides a sufficient daily dose of folic acid to prevent 

CC neural tube defects in foetuses. The supplement provides health 

CC improvements to a human population without the administration of 

CC medication. The present sequence represents bovine beta-casein type A2 
PT Modulating oxidative stress level in a cell, involves contacting the cell
PT with an oxidative stress regulator selected from colostrinin, its
PT constituent peptide, analog or their combinations.
CC The present invention relates to a method for modulating the oxidative
CC stress level in a cell or a patient, comprising contacting the cell with,
CC or administering to the patient, an oxidative stress regulator selected
CC from colostrinin, or its constituent peptide (e.g. the present peptide),
CC to change the level of an oxidising species in the cell. The method can
CC be used to treat oxidative damage to skin, by decreasing or preventing an
CC increase in the level of damage to a biomolecule of the patient

PT Inducing a cytokine and modulating an immune response, useful for
PT treating central nervous system diseases and bacterial and viral
PT infections, comprises administering colostrinin as an immunological
PT regulator.
PS Claim 1; Page 34; 50pp; English.
CC Sequences AAB72246 - AAB72275 represent peptides derived from colostrinin,
CC a proline rich polypeptide aggregate contained in colostrum. The peptides
CC have immune response modulatory activity, and are capable of inducing
CC cytokines. Colostrinin and its derived peptides are useful for inducing
CC cytokine production, for modulating an immunological response and for
CC inducing blood cell proliferation. The peptides are useful in the



CC treatment of disorders of the central nervous system, neurological 

CC disorders, mental disorders, dementia, neurodegenerative diseases, 

CC Alzheimer's disease, motor neurone disease, psychosis, neurosis, chronic 

CC disorders of the immune system, bacterial and viral infections and 

CC acquired immunological deficiencies 

PT Use of colostrinin, its constituent peptide or analog as a neural cell
PT regulator, for promoting neural cell differentiation and treating damaged
PT neural cells in a patient.
PS Claim 6; Page 21; 35pp; English.
CC The present invention relates to a method for promoting neural cell
CC differentiation and treating damaged neural cells, using colostrinin and
CC cell regulator. Colostrinin is a polypeptide complex found in colostrum
XX





SQ Sequence 10 AA; 



Query Match 64.0%; Score 57; DB 4; Length 10;

Best Local Similarity 100.0%; Pred. No. 0.66; 

Matches 10; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          6 QPLPPTVMFP 15
              ||||||||||
Db          1 QPLPPTVMFP 10



Search completed: August 24, 2004, 15:42:49 
Job time : 63.1194 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:         August 24, 2004, 15:33:13 ; Search time 16.4552 Seconds
                                            (without alignments)
                                            47.060 Million cell updates/sec

Title:          US-09-641-801-34
Perfect score:  89
Sequence:       1 MHQPPQPLPPTVMFP 15

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5


Searched: 389414 seqs, 51625971 residues

Total number of hits satisfying chosen parameters: 389414

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : Issued_Patents_AA:*

1: /cgn2_6/ptodata/2/iaa/5A_COMB.pep:*
2: /cgn2_6/ptodata/2/iaa/5B_COMB.pep:*
3: /cgn2_6/ptodata/2/iaa/6A_COMB.pep:*
4: /cgn2_6/ptodata/2/iaa/6B_COMB.pep:*
5: /cgn2_6/ptodata/2/iaa/PCTUS_COMB.pep:*
6: /cgn2_6/ptodata/2/iaa/backfiles1.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
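
The "Perfect score" and "Query Match" figures in this report can be checked by hand. The following is a minimal sketch, assuming the scores are plain sums of standard BLOSUM62 entries over an ungapped alignment; only the matrix entries needed for this query are included, and the function names are illustrative rather than part of GenCore.

```python
# Subset of the standard BLOSUM62 matrix: only the residue pairs needed
# for the query MHQPPQPLPPTVMFP and the single P/S mismatch seen in the
# beta-casein hits. This is an illustrative subset, not the full table.
BLOSUM62_SUBSET = {
    ("M", "M"): 5, ("H", "H"): 8, ("Q", "Q"): 5, ("P", "P"): 7,
    ("L", "L"): 4, ("T", "T"): 5, ("V", "V"): 4, ("F", "F"): 6,
    ("P", "S"): -1,
}


def pair_score(a, b):
    """Look up a substitution score; the matrix is symmetric."""
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    return BLOSUM62_SUBSET[(b, a)]


def ungapped_score(query, subject):
    """Sum substitution scores over two equal-length, ungapped sequences."""
    if len(query) != len(subject):
        raise ValueError("sequences must have equal length")
    return sum(pair_score(a, b) for a, b in zip(query, subject))


def query_match_percent(score, perfect_score):
    """Score as a percentage of the query's self-score, to one decimal."""
    return round(100.0 * score / perfect_score, 1)


QUERY = "MHQPPQPLPPTVMFP"
perfect = ungapped_score(QUERY, QUERY)            # query aligned to itself
casein = ungapped_score(QUERY, "MHQPPQPLSPTVMFP")  # one P -> S mismatch

print(perfect, casein, query_match_percent(casein, perfect))
```

Under these assumptions the self-score is 89 and the single-mismatch alignment scores 81, which is 91.0% of the perfect score, in agreement with the values listed for the 222-residue beta-casein hits.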

SUMMARIES

Result          Query
   No.  Score   Match  Length  DB  ID                     Description
     1     89   100.0      15   4  US-09-641-803-34       Sequence 34, Appl
     2     81    91.0     222   2  US-08-391-743A-2       Sequence 2, Appli
     3     81    91.0     222   4  US-09-143-155-2        Sequence 2, Appli
     4     80    89.9     209   3  US-09-269-220-1        Sequence 1, Appli
     5     80    89.9     209   3  US-09-269-220-2        Sequence 2, Appli
     6     80    89.9     209   4  US-08-836-778-2        Sequence 2, Appli
     7     57    64.0      10   4  US-09-641-803-25       Sequence 25, Appl
     8     53    59.6      10   4  US-09-794-346-1        Sequence 1, Appli
     9     50    56.2     751   4  US-10-020-079-8        Sequence 8, Appli
    10     50    56.2     764   4  US-10-020-079-6        Sequence 6, Appli
    11     50    56.2     776   4  US-10-020-079-24       Sequence 24, Appl
    12     50    56.2     789   4  US-10-020-079-22       Sequence 22, Appl
    13     50    56.2     838   4  US-10-020-079-40       Sequence 40, Appl
    14     50    56.2     851   4  US-10-020-079-38       Sequence 38, Appl
    15     50    56.2     863   4  US-10-020-079-32       Sequence 32, Appl
    16     50    56.2     864   4  US-10-020-079-4        Sequence 4, Appli
    17     50    56.2     870   4  US-10-020-079-2        Sequence 2, Appli
    18     50    56.2     876   4  US-10-020-079-30       Sequence 30, Appl
    19     50    56.2     889   4  US-10-020-079-20       Sequence 20, Appl
    20     50    56.2     895   4  US-10-020-079-18       Sequence 18, Appl
    21     50    56.2     951   4  US-10-020-079-36       Sequence 36, Appl
    22     50    56.2     957   4  US-10-020-079-34       Sequence 34, Appl
    23     50    56.2     976   4  US-10-020-079-28       Sequence 28, Appl
    24     50    56.2     982   4  US-10-020-079-26       Sequence 26, Appl
    25     47    52.8     188   4  US-09-252-991A-28564   Sequence 28564, A
    26     47    52.8     210   1  US-08-078-090-2        Sequence 2, Appli
    27     47    52.8     213   3  US-09-131-028A-2       Sequence 2, Appli
    28     47    52.8     213   3  US-09-131-028A-12      Sequence 12, Appl
    29     47    52.8     385   1  US-08-450-257-58       Sequence 58, Appl
    30     47    52.8     385   1  US-08-450-246-58       Sequence 58, Appl
    31     47    52.8     385   1  US-08-450-098-58       Sequence 58, Appl
    32     47    52.8     385   1  US-08-451-233-58       Sequence 58, Appl
    33     47    52.8     385   1  US-08-450-236-58       Sequence 58, Appl
    34     47    52.8     385   4  US-08-235-403-58       Sequence 58, Appl
    35     47    52.8     490   3  US-09-039-555B-14      Sequence 14, Appl
    36     46    51.7     154   4  US-09-252-991A-30960   Sequence 30960, A
    37     46    51.7     480   4  US-09-149-476-405      Sequence 405, App
    38     46    51.7     708   4  US-09-857-556A-12      Sequence 12, Appl
    39     45    50.6      97   4  US-09-673-395A-388     Sequence 388, App
    40     45    50.6     173   4  US-09-252-991A-23800   Sequence 23800, A
    41     45    50.6     278   4  US-09-252-991A-32988   Sequence 32988, A
    42     45    50.6    2414   1  US-08-227-536-2        Sequence 2, Appli
    43     45    50.6    2414   5  PCT-US95-04682-2       Sequence 2, Appli
    44     44    49.4      95   4  US-09-314-268-132      Sequence 132, App
    45   43.5    48.9     255   4  US-09-489-039A-9101    Sequence 9101, Ap



ALIGNMENTS 



RESULT 1
US-09-641-803-34
; Sequence 34, Application US/09641803
; Patent No. 6500798
; GENERAL INFORMATION:
; APPLICANT: STANTON, G. John
; APPLICANT: HUGHES, Thomas K.
; APPLICANT: BOLDOGH, Istvan
; TITLE OF INVENTION: USE OF COLOSTRININ, CONSTITUENT PEPTIDES THEREOF, AND
; TITLE OF INVENTION: ANALOGS THEREOF AS OXIDATIVE STRESS REGULATORS
; FILE REFERENCE: 265.00220101
; CURRENT APPLICATION NUMBER: US/09/641,803
; CURRENT FILING DATE: 2000-08-17
; PRIOR APPLICATION NUMBER: 60/149,310
; PRIOR FILING DATE: 1999-08-17
; NUMBER OF SEQ ID NOS: 34
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 34
; LENGTH: 15
; TYPE: PRT
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Description of Artificial Sequence: synthetic
; OTHER INFORMATION: peptide
US-09-641-803-34

Query Match 100.0%; Score 89; DB 4; Length 15;
Best Local Similarity 100.0%; Pred. No. 3.2e-05;
Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMFP 15
              |||||||||||||||
Db          1 MHQPPQPLPPTVMFP 15



RESULT 2
US-08-391-743A-2
; Sequence 2, Application US/08391743A
; Patent No. 5843705
; GENERAL INFORMATION:
; APPLICANT: DiTullio, Paul A.; Meade, Harry; Cole, Edward S.
; TITLE OF INVENTION: TRANSGENICALLY PRODUCED ANTITHROMBIN III
; NUMBER OF SEQUENCES: 2
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: LAHIVE & COCKFIELD, LLP
; STREET: 28 State Street
; CITY: Boston
; STATE: Massachusetts
; COUNTRY: USA
; ZIP: 02109
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/391,743A
; FILING DATE: 21-FEB-1995
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; ATTORNEY/AGENT INFORMATION:
; NAME: Myers, Paul Louis
; REGISTRATION NUMBER: 35,965
; REFERENCE/DOCKET NUMBER: TCI-045
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (617)227-7400
; TELEFAX: (617)742-4214
; INFORMATION FOR SEQ ID NO: 2:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 222 amino acids
; TYPE: amino acid
; TOPOLOGY: linear
; MOLECULE TYPE: protein
US-08-391-743A-2

Query Match 91.0%; Score 81; DB 2; Length 222;
Best Local Similarity 93.3%; Pred. No. 0.0036;
Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMFP 15
              |||||||| ||||||
Db        159 MHQPPQPLSPTVMFP 173
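
The line between each Qy/Db pair marks identical residues, and the Matches, Mismatches and Best Local Similarity figures follow from it. Below is a minimal Python sketch for ungapped alignments only (every alignment shown in this report has Indels 0). The function names are hypothetical. The `conservative` argument is an assumption: the caller supplies the residue pairs to be marked `:`, since the scoring-matrix lookup that decides this is not reproduced here.

```python
def match_line(query: str, subject: str, conservative=frozenset()) -> str:
    """Build the marker line: '|' identical, ':' listed as conservative, ' ' otherwise."""
    if len(query) != len(subject):
        raise ValueError("ungapped sketch: aligned segments must have equal length")
    marks = []
    for q, s in zip(query, subject):
        if q == s:
            marks.append("|")
        elif frozenset((q, s)) in conservative:
            marks.append(":")
        else:
            marks.append(" ")
    return "".join(marks)


def summarize(query: str, subject: str, conservative=frozenset()) -> dict:
    """Count matches, conservative pairs and mismatches, plus percent identity."""
    line = match_line(query, subject, conservative)
    matches = line.count("|")
    return {
        "matches": matches,
        "conservative": line.count(":"),
        "mismatches": line.count(" "),
        "best_local_similarity": round(100.0 * matches / len(line), 1),
    }


qy = "MHQPPQPLPPTVMFP"  # query, residues 1-15
db = "MHQPPQPLSPTVMFP"  # database hit, residues 159-173
print(match_line(qy, db))
print(summarize(qy, db))
```

Applied to the pair above, the sketch gives 14 matches and 1 mismatch out of 15 aligned positions, in line with the 93.3% Best Local Similarity reported for this hit.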



RESULT 3
US-09-143-155-2
; Sequence 2, Application US/09143155
; Patent No. 6441145
; GENERAL INFORMATION:
; APPLICANT: DiTullio, Paul A.; Meade, Harry; Cole, Edward S.
; TITLE OF INVENTION: TRANSGENICALLY PRODUCED ANTITHROMBIN III
; NUMBER OF SEQUENCES: 2
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: LAHIVE & COCKFIELD, LLP
; STREET: 28 State Street
; CITY: Boston
; STATE: Massachusetts
; COUNTRY: USA
; ZIP: 02109
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/143,155
; FILING DATE: 28-Aug-1998
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/391,743
; FILING DATE: <Unknown>
; ATTORNEY/AGENT INFORMATION:
; NAME: Myers, Paul Louis
; REGISTRATION NUMBER: 35,965
; REFERENCE/DOCKET NUMBER: TCI-045
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (617)227-7400
; TELEFAX: (617)742-4214
; INFORMATION FOR SEQ ID NO: 2:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 222 amino acids
; TYPE: amino acid
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; SEQUENCE DESCRIPTION: SEQ ID NO: 2:
US-09-143-155-2

Query Match 91.0%; Score 81; DB 4; Length 222;
Best Local Similarity 93.3%; Pred. No. 0.0036;
Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMFP 15
              |||||||| ||||||
Db        159 MHQPPQPLSPTVMFP 173



RESULT 4
US-09-269-220-1
; Sequence 1, Application US/09269220
; Patent No. 6180761
; GENERAL INFORMATION:
; APPLICANT: HAN, Sang K
; APPLICANT: SHIN, Yoo C
; TITLE OF INVENTION: CASEIN AND PROCESS FOR THE PREPARATION THEREOF
; FILE REFERENCE: 1423.1001/MJH
; CURRENT APPLICATION NUMBER: US/09/269,220
; CURRENT FILING DATE: 1999-03-23
; PRIOR APPLICATION NUMBER: KR 1996-43482
; PRIOR FILING DATE: 1996-03-23
; PRIOR APPLICATION NUMBER: PCT/KR97/00182
; PRIOR FILING DATE: 1997-09-23
; NUMBER OF SEQ ID NOS: 7
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 1
; LENGTH: 209
; TYPE: PRT
; ORGANISM: Bos taurus
; FEATURE:
; NAME/KEY: ACT_SITE
; LOCATION: (15)
; OTHER INFORMATION: phosphorylated serine
; NAME/KEY: ACT_SITE
; LOCATION: (17)..(19)
US-09-269-220-1

Query Match 89.9%; Score 80; DB 3; Length 209;
Best Local Similarity 93.3%; Pred. No. 0.0046;
Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMFP 15
              |||| ||||||||||
Db        144 MHQPHQPLPPTVMFP 158



RESULT 5
US-09-269-220-2
; Sequence 2, Application US/09269220
; Patent No. 6180761
; GENERAL INFORMATION:
; APPLICANT: HAN, Sang K
; APPLICANT: SHIN, Yoo C
; TITLE OF INVENTION: CASEIN AND PROCESS FOR THE PREPARATION THEREOF
; FILE REFERENCE: 1423.1001/MJH
; CURRENT APPLICATION NUMBER: US/09/269,220
; CURRENT FILING DATE: 1999-03-23
; PRIOR APPLICATION NUMBER: KR 1996-43482
; PRIOR FILING DATE: 1996-03-23
; PRIOR APPLICATION NUMBER: PCT/KR97/00182
; PRIOR FILING DATE: 1997-09-23
; NUMBER OF SEQ ID NOS: 7
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 2
; LENGTH: 209
; TYPE: PRT
; ORGANISM: Bos taurus
US-09-269-220-2

Query Match 89.9%; Score 80; DB 3; Length 209;
Best Local Similarity 93.3%; Pred. No. 0.0046;
Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMFP 15
              |||| ||||||||||
Db        144 MHQPHQPLPPTVMFP 158



RESULT 6
US-08-836-778-2
; Sequence 2, Application US/08836778
; Patent No. 6451368
; GENERAL INFORMATION:
; APPLICANT: ELLIOTT, ROBERT BARTLETT
; APPLICANT: HILL, JEREMY PAUL
; TITLE OF INVENTION: METHOD OF SELECTING NON-DIABETOGENIC MILK OR MILK
; TITLE OF INVENTION: PRODUCTS AND MILK OR MILK PRODUCTS SO SELECTED
; FILE REFERENCE: P369648 DCC
; CURRENT APPLICATION NUMBER: US/08/836,778
; CURRENT FILING DATE: 1995-11-03
; PRIOR APPLICATION NUMBER: NZ 264862
; PRIOR FILING DATE: 1994-11-04
; NUMBER OF SEQ ID NOS: 2
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 2
; LENGTH: 209
; TYPE: PRT
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Description of Artificial Sequence: BOVINE MILK
; OTHER INFORMATION: PROTEIN
US-08-836-778-2

Query Match 89.9%; Score 80; DB 4; Length 209;
Best Local Similarity 93.3%; Pred. No. 0.0046;
Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMFP 15
              |||| ||||||||||
Db        144 MHQPHQPLPPTVMFP 158



RESULT 7
US-09-641-803-25
; Sequence 25, Application US/09641803
; Patent No. 6500798
; GENERAL INFORMATION:
; APPLICANT: STANTON, G. John
; APPLICANT: HUGHES, Thomas K.
; APPLICANT: BOLDOGH, Istvan
; TITLE OF INVENTION: USE OF COLOSTRININ, CONSTITUENT PEPTIDES THEREOF, AND
; TITLE OF INVENTION: ANALOGS THEREOF AS OXIDATIVE STRESS REGULATORS
; FILE REFERENCE: 265.00220101
; CURRENT APPLICATION NUMBER: US/09/641,803
; CURRENT FILING DATE: 2000-08-17
; PRIOR APPLICATION NUMBER: 60/149,310
; PRIOR FILING DATE: 1999-08-17
; NUMBER OF SEQ ID NOS: 34
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 25
; LENGTH: 10
; TYPE: PRT
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Description of Artificial Sequence: synthetic
; OTHER INFORMATION: peptide
US-09-641-803-25

Query Match 64.0%; Score 57; DB 4; Length 10;
Best Local Similarity 100.0%; Pred. No. 0.18;
Matches 10; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          6 QPLPPTVMFP 15
              ||||||||||
Db          1 QPLPPTVMFP 10



RESULT 8
US-09-794-346-1
; Sequence 1, Application US/09794346
; Patent No. 6627604
; GENERAL INFORMATION:
; APPLICANT: Aventis Pharma Deutschland GmbH
; TITLE OF INVENTION: Memno Peptides, Process for Their Preparation and Use
Thereof
; FILE REFERENCE: 02481.1728
; CURRENT APPLICATION NUMBER: US/09/794,346
; CURRENT FILING DATE: 2001-02-28
; PRIOR APPLICATION NUMBER: EP 00104114.4
; PRIOR FILING DATE: 2000-02-29
; PRIOR APPLICATION NUMBER: PCT/EP 01/01661
; PRIOR FILING DATE: 2001-02-15
; NUMBER OF SEQ ID NOS: 1
; SOFTWARE: PatentIn version 3.0
; SEQ ID NO 1
; LENGTH: 10
; TYPE: PRT
; ORGANISM: artificial sequence
; FEATURE:
; NAME/KEY: misc_feature
; OTHER INFORMATION: Description of Artificial Sequence: Memnoniella
echinata, FH 227
; OTHER INFORMATION: 1, DSM 1319
US-09-794-346-1

Query Match 59.6%; Score 53; DB 4; Length 10;
Best Local Similarity 90.0%; Pred. No. 0.55;
Matches 9; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPP 10
              |||| |||||
Db          1 MHQPHQPLPP 10



RESULT 9
US-10-020-079-8
; Sequence 8, Application US/10020079
; Patent No. 6579710
; GENERAL INFORMATION:
; APPLICANT: Turner, C. Alexander Jr.
; APPLICANT: Mathur, Brian
; APPLICANT: Friddle, Carl Johan
; TITLE OF INVENTION: Novel Human Kinases and Polynucleotides Encoding the Same
; FILE REFERENCE: LEX-0281-USA
; CURRENT APPLICATION NUMBER: US/10/020,079
; CURRENT FILING DATE: 2001-12-12
; PRIOR APPLICATION NUMBER: US 60/255,103
; PRIOR FILING DATE: 2000-12-12
; PRIOR APPLICATION NUMBER: US 60/289,422
; PRIOR FILING DATE: 2001-05-08
; NUMBER OF SEQ ID NOS: 40
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 8
; LENGTH: 751
; TYPE: PRT
; ORGANISM: homo sapiens
; FEATURE:
; NAME/KEY: VARIANT
; LOCATION: (1)...(751)
; OTHER INFORMATION: Xaa = Any Amino Acid
US-10-020-079-8

Query Match 56.2%; Score 50; DB 4; Length 751;
Best Local Similarity 88.9%; Pred. No. 70;
Matches 8; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          2 HQPPQPLPP 10
              | |||||||
Db        597 HLPPQPLPP 605



RESULT 10
US-10-020-079-6
; Sequence 6, Application US/10020079
; Patent No. 6579710
; GENERAL INFORMATION:
; APPLICANT: Turner, C. Alexander Jr.
; APPLICANT: Mathur, Brian
; APPLICANT: Friddle, Carl Johan
; TITLE OF INVENTION: Novel Human Kinases and Polynucleotides Encoding the Same
; FILE REFERENCE: LEX-0281-USA
; CURRENT APPLICATION NUMBER: US/10/020,079
; CURRENT FILING DATE: 2001-12-12
; PRIOR APPLICATION NUMBER: US 60/255,103
; PRIOR FILING DATE: 2000-12-12
; PRIOR APPLICATION NUMBER: US 60/289,422
; PRIOR FILING DATE: 2001-05-08
; NUMBER OF SEQ ID NOS: 40
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 6
; LENGTH: 764
; TYPE: PRT
; ORGANISM: homo sapiens
; FEATURE:
; NAME/KEY: VARIANT
; LOCATION: (1)...(764)
; OTHER INFORMATION: Xaa = Any Amino Acid
US-10-020-079-6

Query Match 56.2%; Score 50; DB 4; Length 764;
Best Local Similarity 88.9%; Pred. No. 71;
Matches 8; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          2 HQPPQPLPP 10
              | |||||||
Db        610 HLPPQPLPP 618



RESULT 11
US-10-020-079-24
; Sequence 24, Application US/10020079
; Patent No. 6579710
; GENERAL INFORMATION:
; APPLICANT: Turner, C. Alexander Jr.
; APPLICANT: Mathur, Brian
; APPLICANT: Friddle, Carl Johan
; TITLE OF INVENTION: Novel Human Kinases and Polynucleotides Encoding the Same
; FILE REFERENCE: LEX-0281-USA
; CURRENT APPLICATION NUMBER: US/10/020,079
; CURRENT FILING DATE: 2001-12-12
; PRIOR APPLICATION NUMBER: US 60/255,103
; PRIOR FILING DATE: 2000-12-12
; PRIOR APPLICATION NUMBER: US 60/289,422
; PRIOR FILING DATE: 2001-05-08
; NUMBER OF SEQ ID NOS: 40
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 24
; LENGTH: 776
; TYPE: PRT
; ORGANISM: homo sapiens
; FEATURE:
; NAME/KEY: VARIANT
; LOCATION: (1)...(776)
; OTHER INFORMATION: Xaa = Any Amino Acid
US-10-020-079-24

Query Match 56.2%; Score 50; DB 4; Length 776;
Best Local Similarity 88.9%; Pred. No. 72;
Matches 8; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          2 HQPPQPLPP 10
              | |||||||
Db        597 HLPPQPLPP 605



RESULT 12
US-10-020-079-22
; Sequence 22, Application US/10020079
; Patent No. 6579710
; GENERAL INFORMATION:
; APPLICANT: Turner, C. Alexander Jr.
; APPLICANT: Mathur, Brian
; APPLICANT: Friddle, Carl Johan
; TITLE OF INVENTION: Novel Human Kinases and Polynucleotides Encoding the Same
; FILE REFERENCE: LEX-0281-USA
; CURRENT APPLICATION NUMBER: US/10/020,079
; CURRENT FILING DATE: 2001-12-12
; PRIOR APPLICATION NUMBER: US 60/255,103
; PRIOR FILING DATE: 2000-12-12
; PRIOR APPLICATION NUMBER: US 60/289,422
; PRIOR FILING DATE: 2001-05-08
; NUMBER OF SEQ ID NOS: 40
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 22
; LENGTH: 789
; TYPE: PRT
; ORGANISM: homo sapiens
; FEATURE:
; NAME/KEY: VARIANT
; LOCATION: (1)...(789)
; OTHER INFORMATION: Xaa = Any Amino Acid
US-10-020-079-22

Query Match 56.2%; Score 50; DB 4; Length 789;
Best Local Similarity 88.9%; Pred. No. 73;
Matches 8; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          2 HQPPQPLPP 10
              | |||||||
Db        610 HLPPQPLPP 618



RESULT 13
US-10-020-079-40
; Sequence 40, Application US/10020079
; Patent No. 6579710
; GENERAL INFORMATION:
; APPLICANT: Turner, C. Alexander Jr.
; APPLICANT: Mathur, Brian
; APPLICANT: Friddle, Carl Johan
; TITLE OF INVENTION: Novel Human Kinases and Polynucleotides Encoding the Same
; FILE REFERENCE: LEX-0281-USA
; CURRENT APPLICATION NUMBER: US/10/020,079
; CURRENT FILING DATE: 2001-12-12
; PRIOR APPLICATION NUMBER: US 60/255,103
; PRIOR FILING DATE: 2000-12-12
; PRIOR APPLICATION NUMBER: US 60/289,422
; PRIOR FILING DATE: 2001-05-08
; NUMBER OF SEQ ID NOS: 40
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 40
; LENGTH: 838
; TYPE: PRT
; ORGANISM: homo sapiens
; FEATURE:
; NAME/KEY: VARIANT
; LOCATION: (1)...(838)
; OTHER INFORMATION: Xaa = Any Amino Acid
US-10-020-079-40

Query Match 56.2%; Score 50; DB 4; Length 838;
Best Local Similarity 88.9%; Pred. No. 77;
Matches 8; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          2 HQPPQPLPP 10
              | |||||||
Db        684 HLPPQPLPP 692



RESULT 14
US-10-020-079-38
; Sequence 38, Application US/10020079
; Patent No. 6579710
; GENERAL INFORMATION:
; APPLICANT: Turner, C. Alexander Jr.
; APPLICANT: Mathur, Brian
; APPLICANT: Friddle, Carl Johan
; TITLE OF INVENTION: Novel Human Kinases and Polynucleotides Encoding the Same
; FILE REFERENCE: LEX-0281-USA
; CURRENT APPLICATION NUMBER: US/10/020,079
; CURRENT FILING DATE: 2001-12-12
; PRIOR APPLICATION NUMBER: US 60/255,103
; PRIOR FILING DATE: 2000-12-12
; PRIOR APPLICATION NUMBER: US 60/289,422
; PRIOR FILING DATE: 2001-05-08
; NUMBER OF SEQ ID NOS: 40
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 38
; LENGTH: 851
; TYPE: PRT
; ORGANISM: homo sapiens
; FEATURE:
; NAME/KEY: VARIANT
; LOCATION: (1)...(851)
; OTHER INFORMATION: Xaa = Any Amino Acid
US-10-020-079-38

Query Match 56.2%; Score 50; DB 4; Length 851;
Best Local Similarity 88.9%; Pred. No. 78;
Matches 8; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          2 HQPPQPLPP 10
              | |||||||
Db        697 HLPPQPLPP 705



RESULT 15
US-10-020-079-32
; Sequence 32, Application US/10020079
; Patent No. 6579710
; GENERAL INFORMATION:
; APPLICANT: Turner, C. Alexander Jr.
; APPLICANT: Mathur, Brian
; APPLICANT: Friddle, Carl Johan
; TITLE OF INVENTION: Novel Human Kinases and Polynucleotides Encoding the Same
; FILE REFERENCE: LEX-0281-USA
; CURRENT APPLICATION NUMBER: US/10/020,079
; CURRENT FILING DATE: 2001-12-12
; PRIOR APPLICATION NUMBER: US 60/255,103
; PRIOR FILING DATE: 2000-12-12
; PRIOR APPLICATION NUMBER: US 60/289,422
; PRIOR FILING DATE: 2001-05-08
; NUMBER OF SEQ ID NOS: 40
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 32
; LENGTH: 863
; TYPE: PRT
; ORGANISM: homo sapiens
; FEATURE:
; NAME/KEY: VARIANT
; LOCATION: (1)...(863)
; OTHER INFORMATION: Xaa = Any Amino Acid
US-10-020-079-32

Query Match 56.2%; Score 50; DB 4; Length 863;
Best Local Similarity 88.9%; Pred. No. 79;
Matches 8; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          2 HQPPQPLPP 10
              | |||||||
Db        684 HLPPQPLPP 692



Search completed: August 24, 2004, 15:55:22 
Job time : 17.4552 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd.



OM protein - protein search, using sw model

Run on:         August 24, 2004, 15:26:28 ; Search time 14.5522 Seconds
                (without alignments)
                99.151 Million cell updates/sec

Title:          US-09-641-801-34
Perfect score:  89
Sequence:       1 MHQPPQPLPPTVMFP 15

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5
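
The perfect score of 89 and the lower scores reported for the near-identical hits can be reproduced by summing BLOSUM62 values over an aligned pair. The Python sketch below does this for ungapped alignments only; every alignment shown in this report has Indels 0, so the gap penalties (Gapop 10.0, Gapext 0.5) never come into play. `BLOSUM62_EXCERPT` holds only the handful of matrix entries these examples need, and the function names are hypothetical. This is a check of the arithmetic, not the search program's implementation.

```python
# Excerpt of the BLOSUM62 matrix: only the entries needed for the examples below.
BLOSUM62_EXCERPT = {
    ("M", "M"): 5, ("H", "H"): 8, ("Q", "Q"): 5, ("P", "P"): 7,
    ("L", "L"): 4, ("T", "T"): 5, ("V", "V"): 4, ("F", "F"): 6,
    ("P", "S"): -1, ("P", "H"): -2, ("Q", "L"): -2,
}


def pair_score(a: str, b: str) -> int:
    """Look up a residue pair in the excerpt; the matrix is symmetric."""
    if (a, b) in BLOSUM62_EXCERPT:
        return BLOSUM62_EXCERPT[(a, b)]
    if (b, a) in BLOSUM62_EXCERPT:
        return BLOSUM62_EXCERPT[(b, a)]
    raise KeyError(f"pair {a}/{b} is not in this excerpt")


def ungapped_score(query: str, subject: str) -> int:
    """Sum substitution scores over two aligned segments of equal length."""
    if len(query) != len(subject):
        raise ValueError("segments must have equal length")
    return sum(pair_score(q, s) for q, s in zip(query, subject))


query = "MHQPPQPLPPTVMFP"
print(ungapped_score(query, query))              # self-score of the query
print(ungapped_score(query, "MHQPPQPLSPTVMFP"))  # hit with one P -> S difference
print(ungapped_score(query, "MHQPHQPLPPTVMFP"))  # hit with one P -> H difference
```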



Searched: 283366 seqs, 96191526 residues 
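
The throughput figure can be sanity-checked from the numbers in this header. The sketch below assumes that one "cell update" is one cell of the dynamic-programming matrix, so the total is the query length times the number of database residues; that definition is an assumption, as the report does not spell it out.

```python
query_length = 15               # residues in the query sequence
residues_searched = 96_191_526  # from the "Searched:" line
search_time_s = 14.5522         # from the "Search time" line

# Assumed definition: one cell per (query residue, database residue) pair
cells = query_length * residues_searched
rate_millions = cells / search_time_s / 1e6

print(f"{cells} cells, about {rate_millions:.2f} million cell updates/sec")
```

Under that assumption the result lands within rounding of the 99.151 million cell updates/sec printed in the report.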

Total number of hits satisfying chosen parameters:      283366

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       PIR_78:*
                 pir1:*
                 pir2:*
                 pir3:*
                 pir4:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES 



Result         Query
  No.  Score   Match  Length  DB  ID

    1     89   100.0
ALIGNMENTS



RESULT 1
A32979
beta-casein precursor - sheep
C;Species: Ovis orientalis aries, Ovis ammon aries (domestic sheep)
C;Date: 12-Oct-1989 #sequence_revision 31-Dec-1993 #text_change 13-Aug-1999
C;Accession: A32979; A29173
R;Provot, C.; Persuy, M.A.; Mercier, J.C.
Biochimie 71, 827-832, 1989
A;Title: Complete nucleotide sequence of ovine beta-casein cDNA: inter-species
comparison.
A;Reference number: A32979; MUID:89375530; PMID:2505862
A;Accession: A32979
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-222 <PRO>
A;Cross-references: GB:X16482; NID:g1210; PIDN:CAA34502.1; PID:g1211
A;Note: the authors translated the codon CAC for residue 160 as Lys
R;Richardson, B.C.; Mercier, J.C.
Eur. J. Biochem. 99, 285-297, 1979
A;Title: The primary structure of the ovine beta-caseins.
A;Reference number: A29173; MUID:80046695; PMID:499202
A;Accession: A29173
A;Molecule type: protein
A;Residues: 16-69,'T',71-77,'P',79-81,'A',83-222 <RIC>
C;Superfamily: beta-casein
C;Keywords: milk; phosphoprotein
F;1-15/Domain: signal sequence #status predicted <SIG>
F;16-222/Product: beta-casein #status experimental <MAT>
F;30,32,33,34,50/Binding site: phosphate (Ser) (covalent) (by casein kinase II)
#status predicted

Query Match 100.0%; Score 89; DB 2; Length 222;
Best Local Similarity 100.0%; Pred. No. 7e-05;
Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMFP 15
              |||||||||||||||
Db        159 MHQPPQPLPPTVMFP 173



RESULT 2
JC1384
beta-casein precursor - goat
C;Species: Capra aegagrus hircus (domestic goat)
C;Date: 10-Jun-1993 #sequence_revision 10-Jun-1993 #text_change 23-Feb-1997
C;Accession: JC1384
R;Roberts, B.; DiTullio, P.; Vitale, J.; Hehir, K.; Gordon, K.
Gene 121, 255-262, 1992
A;Title: Cloning of the goat beta-casein-encoding gene and expression in
transgenic mice.
A;Reference number: JC1384; MUID:93077039; PMID:1446822
A;Accession: JC1384
A;Molecule type: DNA
A;Residues: 1-222 <ROB>
A;Cross-references: GB:M90556
C;Genetics:
A;Gene: CSN2
A;Introns: 17/3; 26/3; 35/3; 45/3; 57/3; 221/3
C;Superfamily: beta-casein
C;Keywords: milk; phosphoprotein

Query Match 91.0%; Score 81; DB 2; Length 222;
Best Local Similarity 93.3%; Pred. No. 0.00079;
Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMFP 15
              |||||||| ||||||
Db        159 MHQPPQPLSPTVMFP 173



RESULT 3
A59068
beta-casein variant CnH - bovine
C;Species: Bos primigenius taurus (cattle)
C;Date: 20-Sep-1999 #sequence_revision 24-Sep-1999 #text_change 24-Sep-1999
C;Accession: A59068; B59068
R;Han, S.K.; Shin, Y.C.
Anim. Genet. 27(Suppl.2), 91b, 1996
A;Title: Biochemical characterization of the new beta-casein variant in Korean
cattle.
A;Reference number: A59068
A;Accession: A59068
A;Status: protein sequence not shown
A;Molecule type: protein
A;Residues: 1-209 <HAN1>
A;Experimental source: strain Korean cattle
A;Note: submitted to the Protein Sequence Database, September 1999
A;Note: includes casein phosphopeptide H
A;Accession: B59068
A;Status: protein sequence not shown
A;Molecule type: protein
A;Residues: 1-28 <HAN2>
A;Experimental source: strain Korean cattle
C;Superfamily: beta-casein
C;Keywords: milk; phosphoprotein
F;15,17,18,19/Binding site: phosphate (Ser) (covalent) #status predicted

Query Match 89.9%; Score 80; DB 2; Length 209;
Best Local Similarity 93.3%; Pred. No. 0.001;
Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMFP 15
              |||| ||||||||||
Db        144 MHQPHQPLPPTVMFP 158



RESULT 4
KBBOA2
beta-casein precursor - bovine
C;Species: Bos primigenius taurus (cattle)
C;Date: 24-Apr-1984 #sequence_revision 12-May-1995 #text_change 11-May-2000
C;Accession: I45873; B29087; S01860; A25846; S02429; A90489; A91191; B91192;
C91192; D91192; A90739; I46963; A91413; A03110
R;Bonsing, J.; Ring, J.M.; Stewart, A.F.; Mackinlay, A.G.
Aust. J. Biol. Sci. 41, 527-537, 1988
A;Title: Complete nucleotide sequence of the bovine beta-casein gene.
A;Reference number: I45873; MUID:90147279; PMID:3271384
A;Accession: I45873
A;Status: translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-81,'H',83-224 <BON>
A;Cross-references: GB:M55158; NID:g162804; PIDN:AAA30431.1; PID:g162805
R;Stewart, A.F.; Bonsing, J.; Beattie, C.W.; Shah, F.; Willis, I.M.; Mackinlay,
A.G.
Mol. Biol. Evol. 4, 231-241, 1987
A;Title: Complete nucleotide sequences of bovine alpha-s2- and beta-casein
cDNAs: comparisons with related sequences in other species.
A;Reference number: A93062; MUID:88188989; PMID:2833669
A;Accession: B29087
A;Status: translation not shown
A;Molecule type: mRNA
A;Residues: 1-224 <STE>
A;Cross-references: GB:M16645; NID:g162930; PIDN:AAA30480.1; PID:g162931
A;Experimental source: A2 variant
R;Baev, A.A.; Smirnov, I.K.; Gorodetskii, S.I.
Mol. Biol. 21, 214-222, 1987
A;Title: Primary structure of bovine beta-casein cDNA.
A;Reference number: S01860
A;Accession: S01860
A;Molecule type: mRNA
A;Residues: 1-81,'H',83-224 <BAE>
A;Cross-references: EMBL:X06359; NID:g171; PIDN:CAA29658.1; PID:g757752
A;Experimental source: A1 variant
A;Note: this paper is a translation of the Russian paper published in Mol. Biol.
Moscow (1987) 21: 255-265
R;Jimenez-Flores, R.; Kang, Y.C.; Richardson, T.
Biochem. Biophys. Res. Commun. 142, 617-621, 1987
A;Title: Cloning and sequence analysis of bovine beta-casein cDNA.
A;Reference number: A25846; MUID:87128158; PMID:3814153
A;Accession: A25846
A;Molecule type: mRNA
A;Residues: 1-107,109-151,'PL',154-209,'Q',211-224 <JIM>
A;Cross-references: GB:M15132; NID:g162796; PIDN:AAA30430.1; PID:g162797
R;Carles, C.; Huet, J.C.; Ribadeau-Dumas, B.
FEBS Lett. 229, 265-272, 1988
A;Title: A new strategy for primary structure determination of proteins;
application to bovine beta-casein.
A;Reference number: S02429; MUID:88152252; PMID:3278933
A;Accession: S02429
A;Molecule type: protein
A;Residues: 16-81,'H',83-224 <CAR>
A;Experimental source: A1 variant
R;Yan, S.B.; Wold, F.
Biochemistry 23, 3759-3765, 1984
A;Title: Neoglycoproteins: in vitro introduction of glycosyl units at
glutamines in beta-casein using transglutaminase.
A;Reference number: A90489; MUID:85000478; PMID:6148101
A;Accession: A90489
A;Molecule type: protein
A;Residues: 16-224 <YAN>
R;Ribadeau-Dumas, B.; Brignon, G.; Grosclaude, F.; Mercier, J.C.
Eur. J. Biochem. 25, 505-514, 1972
A;Title: Structure primaire de la caseine beta bovine.
A;Reference number: A91191; MUID:72233212; PMID:4557764
A;Accession: A91191
A;Molecule type: protein
A;Residues: 16-131,'Q',133-151,'PL',154-189,'E',191-209,'Q',211-224 <RIB>
A;Experimental source: A2 variant
A;Note: article in French with an English abstract
R;Grosclaude, F.; Mahe, M.F.; Mercier, J.C.; Ribadeau-Dumas, B.
Eur. J. Biochem. 26, 328-337, 1972
A;Title: Caracterisation des variants genetiques des caseines alpha-S1 et beta
bovines.
A;Reference number: A91192; MUID:72214259; PMID:5064450
A;Note: article in French with an English abstract
A;Accession: B91192
A;Molecule type: protein
A;Residues: 16-81,'H',83-131,'Q',133-151,'PL',154-189,'E',191-209,'Q',211-224
<VA1>
A;Experimental source: A1 variant
A;Accession: C91192
A;Molecule type: protein
A;Residues: 16-81,'H',83-131,'Q',133-136,'R',138-151,'PL',154-189,'E',191-
209,'Q',211-224 <VAB>
A;Experimental source: B variant
A;Accession: D91192
A;Molecule type: protein
A;Residues: 16-51,'K',53-81,'H',83-131,'Q',133-151,'PL',154-189,'E',191-
209,'Q',211-224 <VAC>
A;Experimental source: C variant
A;Note: this variant lacks a phosphate group on 50-Ser
R;Ribadeau-Dumas, B.; Grosclaude, F.; Mercier, J.C.
C. R. Acad. Sci. Hebd. Seances Acad. Sci. D 270, 2369-2372, 1970
A;Title: Localisation dans la chaine peptidique de la caseine beta bovine de la
substitution His/Gln differenciant les variants genetiques A2 et A3.
A;Reference number: A90739; MUID:71252171; PMID:4997616
A;Note: article in French with an English abstract
A;Accession: A90739
A;Molecule type: protein
A;Residues: 118-120,'Q',122-124 <VA3>
A;Experimental source: A3 variant
R;Simons, G.; van den Heuvel, W.; Reynen, T.; Frijters, A.; Rutten, G.; Slangen,
C.J.; Groenen, M.; de Vos, W.M.; Siezen, R.J.
Protein Eng. 6, 763-770, 1993
A;Title: Overproduction of bovine beta-casein in Escherichia coli and
engineering of its main chymosin cleavage site.
A;Reference number: I46963; MUID:94068382; PMID:8248100
A;Accession: I46963
A;Status: translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-120,'Q',122-224 <SIM>
A;Cross-references: GB:S67277; NID:g459291; PIDN:AAB29137.1; PID:g459292
A;Experimental source: A3 variant
R;Grosclaude, F.; Mahe, M.F.; Voglino, G.F.
FEBS Lett. 45, 3-5, 1974
A;Title: Le variant beta-E et le code de phosphorylation des caseines bovines.
A;Reference number: A91413; MUID:75005247; PMID:4411121
A;Note: article in French with an English abstract
A;Accession: A91413
A;Molecule type: protein
A;Residues: 48-50,'K',52-63 <VAE>
A;Experimental source: E variant
A;Note: 50-Ser is phosphorylated
C;Comment: The sequence shown is the A2 variant.
C;Genetics:
A;Introns: 17/3; 26/3; 35/3; 43/3; 57/3; 223/3
C;Superfamily: beta-casein
C;Keywords: milk; phosphoprotein
F;1-15/Domain: signal sequence #status predicted <SIG>
F;16-224/Product: beta-casein #status experimental <MAT>
F;30,32,33,34/Binding site: phosphate (Ser) (covalent) (by casein kinase II)
#status experimental
F;50/Binding site: phosphate (Ser) (covalent) (by casein kinase II) (partial)
#status experimental

Query Match 89.9%; Score 80; DB 1; Length 224;
Best Local Similarity 93.3%; Pred. No. 0.0011;
Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMFP 15
              |||| ||||||||||
Db        159 MHQPHQPLPPTVMFP 173



RESULT 5
A48384
beta-casein - pig
C;Species: Sus scrofa domestica (domestic pig)
C;Date: 19-Nov-1993 #sequence_revision 18-Nov-1994 #text_change 03-May-1996
C;Accession: A48384
R;Alexander, L.J.; Beattie, C.W.
Anim. Genet. 23, 369-371, 1992
A;Title: The sequence of porcine beta-casein cDNA.
A;Reference number: A48384; MUID:92367961; PMID:1503277
A;Accession: A48384
A;Status: preliminary
A;Molecule type: nucleic acid
A;Residues: 1-232 <ALE>
A;Experimental source: mammary gland
A;Note: sequence inconsistent with the nucleotide translation
A;Note: sequence extracted from NCBI backbone (NCBIN:110891, NCBIP:110895)
C;Superfamily: beta-casein

Query Match 61.8%; Score 55; DB 2; Length 232;
Best Local Similarity 71.4%; Pred. No. 2.2;
Matches 10; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMF 14
              ||| |||:| | ||
Db        158 MHQIPQPVPQTPMF 171



RESULT 6
T03455
ALR protein - human
C;Species: Homo sapiens (man)
C;Date: 24-Mar-1999 #sequence_revision 24-Mar-1999 #text_change 27-Oct-2003
C;Accession: T03455
R;Prasad, R.; Zhadanov, A.B.; Sedkov, Y.; Bullrich, F.; Druck, T.; Rallapalli,
R.; Yano, T.; Alder, H.; Croce, C.M.; Huebner, K.; Mazo, A.; Canaani, E.
Oncogene 15, 549-560, 1997
A;Title: Structure and expression pattern of human ALR, a novel gene with strong
homology to ALL-1 involved in acute leukemia, and to Drosophila trithorax.
A;Reference number: Z14954; MUID:97388474; PMID:9247308
A;Accession: T03455
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-4957 <PRA>
A;Cross-references: EMBL:AF010404; NID:g2358286; PIDN:AAC51735.1; PID:g2358287
C;Genetics:
A;Gene: ALR
A;Map position: 12
C;Superfamily: acute lymphoblastic leukemia protein, ALR type
C;Keywords: alternative splicing

Query Match 59.6%; Score 53; DB 2; Length 4957;
Best Local Similarity 57.1%; Pred. No. 86;
Matches 8; Conservative 3; Mismatches 3; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMF 14
              :|:||:| || | |
Db       1886 LHKPPRPQPPEVAF 1899



RESULT 7
T03454
ALR protein - human
C;Species: Homo sapiens (man)
C;Date: 24-Mar-1999 #sequence_revision 24-Mar-1999 #text_change 27-Oct-2003
C;Accession: T03454
R;Prasad, R.; Zhadanov, A.B.; Sedkov, Y.; Bullrich, F.; Druck, T.; Rallapalli,
R.; Yano, T.; Alder, H.; Croce, C.M.; Huebner, K.; Mazo, A.; Canaani, E.
Oncogene 15, 549-560, 1997
A;Title: Structure and expression pattern of human ALR, a novel gene with strong
homology to ALL-1 involved in acute leukemia, and to Drosophila trithorax.
A;Reference number: Z14954; MUID:97388474; PMID:9247308
A;Accession: T03454
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-5262 <PRA>
A;Cross-references: EMBL:AF010403; NID:g2358284; PIDN:AAC51734.1; PID:g2358285
C;Genetics:
A;Gene: ALR
A;Map position: 12
C;Superfamily: acute lymphoblastic leukemia protein, ALR type
C;Keywords: alternative splicing

Query Match 59.6%; Score 53; DB 2; Length 5262;
Best Local Similarity 57.1%; Pred. No. 91;
Matches 8; Conservative 3; Mismatches 3; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMF 14
              :|:||:| || | |
Db       2191 LHKPPRPQPPEVAF 2204



RESULT 8
A60305
beta-casein - Arabian camel (fragment)
C;Species: Camelus dromedarius (Arabian camel)
C;Date: 10-Nov-1992 #sequence_revision 10-Nov-1992 #text_change 30-Sep-1993
C;Accession: A60305
R;Beg, O.U.; von Bahr-Lindstroem, H.; Zaidi, Z.H.; Joernvall, H.
Regul. Pept. 15, 55-62, 1986
A;Title: Characterization of a camel milk protein rich in proline identifies a
new beta-casein fragment.
A;Reference number: A60305; MUID:87017861; PMID:3763959
A;Accession: A60305
A;Molecule type: protein
A;Residues: 1-64 <BEG>
A;Note: this fragment appears to form by a non-tryptic cleavage of beta-casein
in camel milk
C;Superfamily: beta-casein

Query Match 56.2%; Score 50; DB 2; Length 64;
Best Local Similarity 60.0%; Pred. No. 2.8;
Matches 9; Conservative 2; Mismatches 4; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMFP 15
              |:| |||:| | | |
Db          1 MYQIPQPVPQTPMIP 15



RESULT 9 
T23989 

hypothetical protein R07A4.3 - Caenorhabditis elegans 
C; Species: Caenorhabditis elegans 

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 18-Feb-2000 
C; Accession: T23989 
R; Cottage, A. 

submitted to the EMBL Data Library, November 1995 
A;Reference number; Z19827 
A;Accession: T23989 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A; Molecule type: DNA 
A; Residues: 1-295 <WIL> 

A;Cross-references: EMBL:Z67756; PIDN:CAA91763.1; GSPDB:GN00028; CESP:R07A4.

A; Experimental source: clone R07A4 

C; Genetics : 

A; Gene: CESP:R07A4.3 

A;Map position : X 

A;Introns: 29/3; 53/1; 142/1; 168/1; 229/1; 253/1 

Query Match 56.2%; Score 50; DB 2; Length 295; 

Best Local Similarity 75.0%; Pred. No. 13; 

Matches 9; Conservative 1; Mismatches 2; Indels 0; Gaps 

Qy          4 PPQPLPPTVMFP 15
              ||||| |||: |
Db        102 PPQPLKPTVIRP 113



RESULT 10 
T26908 

hypothetical protein Y45F10A.1 - Caenorhabditis elegans 
C; Species: Caenorhabditis elegans 

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 04-Mar-2000 
C;Accession: T26908
R;McMurray, A.

submitted to the EMBL Data Library, January 1998 
A;Reference number: Z20285
A;Accession: T26908

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-292 <WIL>

A;Cross-references: EMBL:AL021488; PIDN:CAA16365.1; GSPDB:GN00022;
CESP:Y45F10A.1

A;Experimental source: clone Y45F10A
C;Genetics:

A;Gene: CESP:Y45F10A.1
A;Map position: 4
A;Introns: 228/2; 261/3

C;Superfamily: Caenorhabditis elegans hypothetical protein Y45F10A.1 

Query Match 55.1%; Score 49; DB 2; Length 292; 

Best Local Similarity 60.0%; Pred. No. 17; 

Matches 9; Conservative 0; Mismatches 6; Indels 0; Gaps 0; 

Qy          1 MHQPPQPLPPTVMFP 15
              || |  |||||   |
Db          1 MHSPNHPLPPTSNSP 15



RESULT 11 
T01123 

hypothetical protein At2g32840 [imported] - Arabidopsis thaliana 
N;Alternate names: hypothetical protein F24L7.2; T21L14.22 
C; Species: Arabidopsis thaliana (mouse-ear cress) 

C;Date: 12-Feb-1999 #sequence_revision 12-Feb-1999 #text_change 16-Feb-2001 
C;Accession: T01123; T00784; B84738 

R;Rounsley, S.D.; Lin, X.; Ketchum, K.A.; Crosby, M.L.; Brandon, R.C.; Sykes,
S.M.; Kaul, S.; Mason, T.M.; Kerlavage, A.R.; Adams, M.D.; Somerville, C.R.;
Venter, J.C.

submitted to the EMBL Data Library, December 1997

A;Description: Arabidopsis thaliana chromosome II BAC T21L14 genomic sequence.
A;Reference number: Z14209
A;Accession: T01123

A;Status: translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-346 <ROU>

A;Cross-references: EMBL:AC003033; NID:g2702261; PID:g2702278
A;Experimental source: cultivar Columbia

R;Rounsley, S.D.; Lin, X.; Ketchum, K.A.; Crosby, M.L.; Brandon, R.C.; Sykes,
S.M.; Kaul, S.; Mason, T.M.; Kerlavage, A.R.; Adams, M.D.; Somerville, C.R.;
Venter, J.C.

submitted to the EMBL Data Library, February 1998

A;Description: Arabidopsis thaliana chromosome II BAC F24L7 genomic sequence.
A;Reference number: Z14204
A;Accession: T00784

A;Status: translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-346 <ROW>

A;Cross-references: EMBL:AC003974; NID:g2914688; PID:g2914690
A;Experimental source: cultivar Columbia

R;Lin, X.; Kaul, S.; Rounsley, S.D.; Shea, T.P.; Benito, M.I.; Town, C.D.;
Fujii, C.Y.; Mason, T.M.; Bowman, C.L.; Barnstead, M.E.; Feldblyum, T.V.; Buell,
C.R.; Ketchum, K.A.; Lee, J.J.; Ronning, C.M.; Koo, H.; Moffat, K.S.; Cronin,
L.A.; Shen, M.; VanAken, S.E.; Umayam, L.; Tallon, L.J.; Gill, J.E.; Adams,
M.D.; Carrera, A.J.; Creasy, T.H.; Goodman, H.M.; Somerville, C.R.; Copenhaver,
C.P.; Preuss, D.; Nierman, W.C.; White, O.; Eisen, J.A.; Salzberg, S.L.; Fraser,
C.M.; Venter, J.C.
Nature 402, 761-768, 1999

A;Title: Sequence and analysis of chromosome 2 of the plant Arabidopsis
thaliana.

A;Reference number: A84420; MUID:20083487; PMID:10617197
A;Accession: B84738

A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-346 <STO>

A;Cross-references: GB:AE002093; NID:g2702278; PIDN:AAB91981.1; GSPDB:GN00139
C;Genetics:

A;Gene: At2g32840; T21L14.22; F24L7.2
A;Map position: 2

A;Introns: 185/3; 196/3; 214/2; 227/3; 253/3; 289/3; 316/2

Query Match 55.1%; Score 49; DB 2; Length 346; 

Best Local Similarity 50.0%; Pred. No. 20; 

Matches 7; Conservative 4; Mismatches 3; Indels 0; Gaps 0; 

Qy          2 HQPPQPLPPTVMFP 15
              |||| | | ::::|
Db         87 HQPPHPDPSSLIYP 100



RESULT 12 
E87601 

OmpA family protein [imported] - Caulobacter crescentus 
C; Species: Caulobacter crescentus 

C;Date: 20-Apr-2001 #sequence_revision 20-Apr-2001 #text_change 20-Apr-2001 
C;Accession: E87601

R;Nierman, W.C.; Feldblyum, T.V.; Paulsen, I.T.; Nelson, K.E.; Eisen, J.;
Heidelberg, J.F.; Alley, M.; Ohta, N.; Maddock, J.R.; Potocka, I.; Nelson, W.C.;
Newton, A.; Stephens, C.; Phadke, N.D.; Ely, B.; Laub, M.T.; DeBoy, R.T.;
Dodson, R.J.; Durkin, A.S.; Gwinn, M.L.; Haft, D.H.; Kolonay, J.F.; Smit, J.;
Craven, M.; Khouri, H.; Shetty, J.; Berry, K.; Utterback, T.; Tran, K.; Wolf,
A.; Vamathevan, J.; Ermolaeva, M.; White, O.; Salzberg, S.L.; Shapiro, L.;
Venter, J.C.; Fraser, C.M.

Proc. Natl. Acad. Sci. U.S.A. 98, 4136-4141, 2001

A;Title: Complete Genome Sequence of Caulobacter crescentus.

A;Reference number: A87249; MUID:21173698; PMID:11259647

A;Accession: E87601

A;Status: preliminary

A;Molecule type: DNA

A;Residues: 1-191 <STO>

A;Cross-references: GB:AE005673; NID:g13424457; PIDN:AAK24809.1; GSPDB:GN00148

C;Genetics:

A;Gene: CC2845

Query Match 52.8%; Score 47; DB 2; Length 191;

Best Local Similarity 66.7%; Pred. No. 21;

Matches 8; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy          4 PPQPLPPTVMFP 15
              |||||||  : |
Db         60 PPQPLPPAPLPP 71



RESULT 13 
KBHU 

beta-casein precursor [validated] - human 
C; Species: Homo sapiens (man) 

C;Date: 30-Jun-1988 #sequence_revision 15-Aug-1997 #text_change 08-Dec-2000 
C;Accession: I53730; S08040; S04049; S11072; A27219; A30773

R;Hansson, L.; Edlund, A.; Johansson, T.; Hernell, O.; Stromqvist, M.;
Lindquist, S.; Lonnerdal, B.; Bergstrom, S.
Gene 139, 193-199, 1994

A;Title: Structure of the human beta-casein encoding gene.
A;Reference number: I53730; MUID:94156198; PMID:8112603
A;Accession: I53730

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-226 <RES>

A;Cross-references: GB:L10615; NID:g2695660; PIDN:AAC82978.1; PID:g2695661
R;Menon, R.S.

submitted to the EMBL Data Library, October 1989
A;Reference number: S08040
A;Accession: S08040
A;Molecule type: mRNA

A;Residues: 1-132,'V',134-139,'Q',141-226 <MEN>

A;Cross-references: EMBL:X17070

R;Menon, R.S.; Ham, R.G.

Nucleic Acids Res. 17, 2869, 1989

A;Title: Human beta-casein: partial cDNA sequence and apparent polymorphism.

A;Reference number: S04049; MUID:89240053; PMID:2717418

A;Accession: S04049

A;Molecule type: mRNA

A;Residues: 161-226 <MEN2>

A;Cross-references: EMBL:X13766; NID:g29673; PIDN:CAA32017.1; PID:g29674
R;Loennerdal, B.; Bergstroem, S.; Andersson, Y.; Hjalmarsson, K.; Sundqvist,
A.K.; Hernell, O.
FEBS Lett. 269, 153-156, 1990

A;Title: Cloning and sequencing of a cDNA encoding human milk beta-casein.

A;Reference number: S11072; MUID:90353560; PMID:2387396

A;Accession: S11072

A;Status: preliminary

A;Molecule type: mRNA

A;Residues: 1-33,35-226 <LOE>

A;Cross-references: GB:X55739; NID:g288097; PIDN:CAA39270.1; PID:g288098
R;Greenberg, R.; Groves, M.L.; Dower, H.J.
J. Biol. Chem. 259, 5132-5138, 1984

A;Title: Human beta-casein: amino acid sequence and identification of
phosphorylation sites.

A;Reference number: A27219; MUID:84185624; PMID:6715339
A;Accession: A27219
A;Molecule type: protein

A;Residues: 16-29,'P',31-47,'T',49,'Q',51-119,'Q',121-148,'S',150-172,'E',174-
181,'E',183,'L',185-187,'V',189-206,'P',208-213,'PE',216,'STTZAB',223-226 <GRE>
C;Genetics:

A;Gene: GDB:CSN2; CASB

A;Cross-references: GDB:125234; OMIM:115460
A;Map position: 4q13-4q21

A;Introns: 17/3; 26/3; 33/3; 48/3; 225/3
C;Superfamily: beta-casein

C;Keywords: calcium; milk; phosphoprotein

F;1-15/Domain: signal sequence #status predicted <SIG>

F;16-226/Product: beta-casein #status experimental <MAT>

F;18/Binding site: phosphate (Thr) (covalent) #status experimental

F;21,23,24,25/Binding site: phosphate (Ser) (covalent) #status experimental

Query Match 52.8%; Score 47; DB 1; Length 226;

Best Local Similarity 53.3%; Pred. No. 25;

Matches 8; Conservative 2; Mismatches 5; Indels 0; Gaps 0

Qy          1 MHQPPQPLPPTVMFP 15
              | | |||:| |:  |
Db        150 MQQVPQPIPQTLALP 164

RESULT 14 
H83619 

hypothetical protein PA0197 [imported] - Pseudomonas aeruginosa (strain PAO1)
C;Species: Pseudomonas aeruginosa

C;Date: 15-Sep-2000 #sequence_revision 15-Sep-2000 #text_change 31-Dec-2000
C;Accession: H83619

R;Stover, C.K.; Pham, X.Q.; Erwin, A.L.; Mizoguchi, S.D.; Warrener, P.; Hickey
M.J.; Brinkman, F.S.L.; Hufnagle, W.O.; Kowalik, D.J.; Lagrou, M.; Garber, R.L
Goltry, L.; Tolentino, E.; Westbrook-Wadman, S.; Yuan, Y.; Brody, L.L.; Coulte
S.N.; Folger, K.R.; Kas, A.; Larbig, K.; Lim, R.M.; Smith, K.A.; Spencer, D.H.
Wong, G.K.S.; Wu, Z.; Paulsen, I.T.; Reizer, J.; Saier, M.H.; Hancock, R.E.W.;
Lory, S.; Olson, M.V.
Nature 406, 959-964, 2000

A;Title: Complete genome sequence of Pseudomonas aeruginosa PAO1, an
opportunistic pathogen.

A;Reference number: A82950; MUID:20437337; PMID:10984043
A;Accession: H83619
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-270 <STO>

A;Cross-references: GB:AE004458; GB:AE004091; NID:g9946031; PIDN:AAG03586.1;
GSPDB:GN00131; PASP:PA0197

A;Experimental source: strain PAO1

C;Genetics:

A;Gene: PA0197

C;Superfamily: tonB protein

Query Match 52.8%; Score 47; DB 2; Length 270; 

Best Local Similarity 66.7%; Pred. No. 29; 

Matches 8; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy 4 PPQPLPPTVMFP 15 

11:11111 I 
Db 103 PPEPLPPWEEP 114 



RESULT 15 
S01919 

knirps protein - fruit fly (Drosophila melanogaster)
C;Species: Drosophila melanogaster

C;Date: 31-Dec-1990 #sequence_revision 31-Dec-1990 #text_change 24-Sep-1998
C;Accession: S01919; S02057

R;Nauber, U.; Pankratz, M.J.; Kienlin, A.; Seifert, E.; Klemm, U.; Jaeckle, H.
Nature 336, 489-492, 1988

A;Title: Abdominal segmentation of the Drosophila embryo requires a hormone
receptor-like protein encoded by the gap gene knirps.
A;Reference number: S01919; MUID:89057148; PMID:2904128
A;Accession: S01919
A;Molecule type: DNA
A;Residues: 1-429 <NAU1>

A;Cross-references: EMBL:X13331

R;Nauber, U.

submitted to the EMBL Data Library, October 1988
A;Reference number: S02057
A;Accession: S02057
A;Molecule type: DNA

A;Residues: 1-106,'L',108-429 <NAU2>

A;Cross-references: EMBL:X13331; NID:g8153; PID:g8154

C;Genetics:

A;Gene: knirps

A;Cross-references: FlyBase:FBgn0001320
A;Introns: 26/3

C;Keywords: DNA binding; nucleus; transcription regulation; zinc

Query Match 52.8%; Score 47; DB 2; Length 429;

Best Local Similarity 57.1%; Pred. No. 46;

Matches 8; Conservative 2; Mismatches 4; Indels 0; Gaps 0;

Qy          2 HQPPQPLPPTVMFP 15
              || |  ||| ::||
Db        183 HQSPFQLPPHLLFP 196



Search completed: August 24, 2004, 15:53:03 
Job time : 16.5522 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on: August 24, 2004, 15:51:19 ; Search time 54.291 Seconds
(without alignments)
86.825 Million cell updates/sec

Title: US-09-641-801-34

Perfect score: 89

Sequence: 1 MHQPPQPLPPTVMFP 15

Scoring table: BLOSUM62

Gapop 10.0 , Gapext 0.5

Searched: 1295152 seqs, 314255058 residues

Total number of hits satisfying chosen parameters: 1295152

Minimum DB seq length:
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries



Database : Published_Applications_AA:

1: /cgn2_6/ptodata/1/pubpaa/US07_PUBCOMB.pep:*
2: /cgn2_6/ptodata/1/pubpaa/PCT_NEW_PUB.pep:*
3: /cgn2_6/ptodata/1/pubpaa/US06_NEW_PUB.pep:*
4: /cgn2_6/ptodata/1/pubpaa/US06_PUBCOMB.pep:*
5: /cgn2_6/ptodata/1/pubpaa/US07_NEW_PUB.pep:*
6: /cgn2_6/ptodata/1/pubpaa/PCTUS_PUBCOMB.pep:
7: /cgn2_6/ptodata/1/pubpaa/US08_NEW_PUB.pep:*
8: /cgn2_6/ptodata/1/pubpaa/US08_PUBCOMB.pep:*
9: /cgn2_6/ptodata/1/pubpaa/US09A_PUBCOMB.pep:
10: /cgn2_6/ptodata/1/pubpaa/US09B_PUBCOMB.pep
11: /cgn2_6/ptodata/1/pubpaa/US09C_PUBCOMB.pep
12: /cgn2_6/ptodata/1/pubpaa/US09_NEW_PUB.pep:
13: /cgn2_6/ptodata/1/pubpaa/US10A_PUBCOMB.pep
14: /cgn2_6/ptodata/1/pubpaa/US10B_PUBCOMB.pep
15: /cgn2_6/ptodata/1/pubpaa/US10C_PUBCOMB.pep
16: /cgn2_6/ptodata/1/pubpaa/US10_NEW_PUB.pep:
17: /cgn2_6/ptodata/1/pubpaa/US60_NEW_PUB.pep:
18: /cgn2_6/ptodata/1/pubpaa/US60_PUBCOMB.pep:



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
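
The "Perfect score" and per-result "Query Match" figures can be checked by hand. The sketch below is illustrative only and is not part of the GenCore output. It assumes that the perfect score is the query aligned against itself under the stated BLOSUM62 table, and that Query Match is the hit score as a percentage of that perfect score; both assumptions agree with the numbers in this listing. The names `BLOSUM62_DIAGONAL`, `perfect_score` and `query_match_percent` are hypothetical, and only the diagonal entries needed for this query are included.

```python
# Diagonal (self-substitution) BLOSUM62 scores for the residues that occur
# in the query. Hypothetical helper table, not taken from the report itself.
BLOSUM62_DIAGONAL = {
    "M": 5, "H": 8, "Q": 5, "P": 7, "L": 4, "T": 5, "V": 4, "F": 6,
}


def perfect_score(query):
    """Score of the query aligned against itself (no gaps)."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in query)


def query_match_percent(score, perfect):
    """Hit score as a percentage of the perfect score, to one decimal."""
    return round(100.0 * score / perfect, 1)


QUERY = "MHQPPQPLPPTVMFP"
PERFECT = perfect_score(QUERY)

print(PERFECT)
for hit_score in (89, 57, 53, 47):
    print(hit_score, query_match_percent(hit_score, PERFECT))
```

With these inputs the self-score comes to 89, and hit scores of 57, 53 and 47 correspond to 64.0%, 59.6% and 52.8%, the same values printed in the result blocks.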
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ALIGNMENTS 



RESULT 1 

US-10-281-652-34
; Sequence 34, Application US/10281652
; Publication No. US20030091606A1
; GENERAL INFORMATION:
; APPLICANT: STANTON, G. John
; APPLICANT: HUGHES, Thomas K.
; APPLICANT: BOLDOGH, Istvan
; TITLE OF INVENTION: USE OF COLOSTRININ, CONSTITUENT PEPTIDES THEREOF, AND
; TITLE OF INVENTION: ANALOGS THEREOF AS OXIDATIVE STRESS REGULATORS
; FILE REFERENCE: 265.00220101
; CURRENT APPLICATION NUMBER: US/10/281,652
; CURRENT FILING DATE: 2002-10-28
; PRIOR APPLICATION NUMBER: US/09/641,803
; PRIOR FILING DATE: 2000-08-17
; PRIOR APPLICATION NUMBER: 60/149,310
; PRIOR FILING DATE: 1999-08-17
; NUMBER OF SEQ ID NOS: 34
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 34
; LENGTH: 15
; TYPE: PRT
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Description of Artificial Sequence: synthetic
; OTHER INFORMATION: peptide
US-10-281-652-34

Query Match 100.0%; Score 89; DB 14; Length 15;
Best Local Similarity 100.0%; Pred. No. 0.0004;
Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMFP 15
              |||||||||||||||
Db          1 MHQPPQPLPPTVMFP 15



RESULT 2 

US-10-281-652-25
; Sequence 25, Application US/10281652
; Publication No. US20030091606A1
; GENERAL INFORMATION:
; APPLICANT: STANTON, G. John
; APPLICANT: HUGHES, Thomas K.
; APPLICANT: BOLDOGH, Istvan
; TITLE OF INVENTION: USE OF COLOSTRININ, CONSTITUENT PEPTIDES THEREOF, AND
; TITLE OF INVENTION: ANALOGS THEREOF AS OXIDATIVE STRESS REGULATORS
; FILE REFERENCE: 265.00220101
; CURRENT APPLICATION NUMBER: US/10/281,652
; CURRENT FILING DATE: 2002-10-28
; PRIOR APPLICATION NUMBER: US/09/641,803
; PRIOR FILING DATE: 2000-08-17
; PRIOR APPLICATION NUMBER: 60/149,310
; PRIOR FILING DATE: 1999-08-17
; NUMBER OF SEQ ID NOS: 34
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 25
; LENGTH: 10
; TYPE: PRT
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Description of Artificial Sequence: synthetic
; OTHER INFORMATION: peptide
US-10-281-652-25

Query Match 64.0%; Score 57; DB 14; Length 10;
Best Local Similarity 100.0%; Pred. No. 1.5;
Matches 10; Conservative 0; Mismatches 0; Indels

Qy          6 QPLPPTVMFP 15
              ||||||||||
Db          1 QPLPPTVMFP 10



RESULT 3 

US-10-408-765A-1080
; Sequence 1080, Application US/10408765A
; Publication No. US20040101874A1
; GENERAL INFORMATION:
; APPLICANT: Ghosh, Soumitra S.
; APPLICANT: Fahy, Eoin D.
; APPLICANT: Zhang, Bing
; APPLICANT: Gibson, Bradford W.
; APPLICANT: Taylor, Steven W.
; APPLICANT: Glenn, Gary M.
; APPLICANT: Warnock, Dale E.
; TITLE OF INVENTION: TARGETS FOR THERAPEUTIC INTERVENTION
; TITLE OF INVENTION: IDENTIFIED IN THE MITOCHONDRIAL PROTEOME
; FILE REFERENCE: 660088.465
; CURRENT APPLICATION NUMBER: US/10/408,765A
; CURRENT FILING DATE: 2003-04-04
; NUMBER OF SEQ ID NOS: 3077
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 1080
; LENGTH: 412
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-408-765A-1080

Query Match 62.9%; Score 56; DB 16; Length 412;
Best Local Similarity 69.2%; Pred. No. 57;
Matches 9; Conservative 1; Mismatches 3; Indels

Qy          3 QPPQPLPPTVMFP 15
              ||||| ||  :||
Db         15 QPPQPAPPPPLFP 27



RESULT 4 

US-10-437-963-152994
; Sequence 152994, Application US/10437963
; Publication No. US20040123343A1
; GENERAL INFORMATION:
; APPLICANT: La Rosa, Thomas J.
; APPLICANT: Kovalic, David K.
; APPLICANT: Zhou, Yihua
; APPLICANT: Cao, Yongwei
; APPLICANT: Wu, Wei
; APPLICANT: Boukharov, Andrey A.
; APPLICANT: Barbazuk, Brad
; APPLICANT: Li, Ping
; TITLE OF INVENTION: Rice Nucleic Acid Molecules and Other Molecules Associated With
; TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement
; FILE REFERENCE: 38-21(53221) B
; CURRENT APPLICATION NUMBER: US/10/437,963
; CURRENT FILING DATE: 2003-05-14
; NUMBER OF SEQ ID NOS: 204966
; SEQ ID NO 152994
; LENGTH: 248
; TYPE: PRT
; ORGANISM: Oryza sativa
; FEATURE:
; NAME/KEY: unsure
; LOCATION: (1)..(248)
; OTHER INFORMATION: unsure at all Xaa locations
; FEATURE:
; OTHER INFORMATION: Clone ID: PAT_MRT4530_52991C.1.pep
US-10-437-963-152994

Query Match 61.8%; Score 55; DB 16; Length 248;
Best Local Similarity 75.0%; Pred. No. 47;
Matches 9; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy          4 PPQPLPPTVMFP 15
              |||| |||:| |
Db        134 PPQPPPPTIMAP 145



RESULT 5 
US-09-794-346-1 

; Sequence 1, Application US/09794346
; Patent No. US20010031857A1
; GENERAL INFORMATION:
; APPLICANT: Aventis Pharma Deutschland GmbH
; TITLE OF INVENTION: Memno Peptides, Process for Their Preparation and Use Thereof
; FILE REFERENCE: 02481.1728
; CURRENT APPLICATION NUMBER: US/09/794,346
; CURRENT FILING DATE: 2001-02-28
; PRIOR APPLICATION NUMBER: EP 00104114.4
; PRIOR FILING DATE: 2000-02-29
; PRIOR APPLICATION NUMBER: PCT/EP 01/01661
; PRIOR FILING DATE: 2001-02-15
; NUMBER OF SEQ ID NOS: 1
; SOFTWARE: PatentIn version 3.0
; SEQ ID NO 1
; LENGTH: 10
; TYPE: PRT
; ORGANISM: artificial sequence
; FEATURE:
; NAME/KEY: misc_feature
; OTHER INFORMATION: Description of Artificial Sequence: Memnoniella echinata, FH 227
; OTHER INFORMATION: 1, DSM 1319
US-09-794-346-1

Query Match 59.6%; Score 53; DB 9; Length 10;
Best Local Similarity 90.0%; Pred. No. 4.5;
Matches 9; Conservative 0; Mismatches 1; Indels 0; Gaps

Qy          1 MHQPPQPLPP 10
              |||| |||||
Db          1 MHQPHQPLPP 10



RESULT 6 
US-10-380-371-1 

; Sequence 1, Application US/10380371
; Publication No. US20040086965A1
; GENERAL INFORMATION:
; APPLICANT: GLAUS, JUAN
; APPLICANT: COMINI, MARCELO
; APPLICANT: TONARELLI, GEORGINA
; APPLICANT: PERIN, JUAN CARLOS
; APPLICANT: SALVETTI, JORGE LUIS
; APPLICANT: FRANK, RONALD
; TITLE OF INVENTION: CASEIN PEPTIDE FRAGMENTS HAVING GROWTH-INFLUENCING
; TITLE OF INVENTION: ACTIVITY ON CELL CULTURES
; FILE REFERENCE: 930008-2096
; CURRENT APPLICATION NUMBER: US/10/380,371
; CURRENT FILING DATE: 2003-10-22
; PRIOR APPLICATION NUMBER: PCT/DE01/03849
; PRIOR FILING DATE: 2001-10-09
; PRIOR APPLICATION NUMBER: DE 10050091.9
; PRIOR FILING DATE: 2000-10-09
; NUMBER OF SEQ ID NOS: 1
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 1
; LENGTH: 10
; TYPE: PRT
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Description of Artificial Sequence: Synthetic
; OTHER INFORMATION: peptide
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: (1)
; OTHER INFORMATION: N-term may be acetylated
US-10-380-371-1

Query Match 59.6%; Score 53; DB 16; Length 10;
Best Local Similarity 90.0%; Pred. No. 4.5;
Matches 9; Conservative 0; Mismatches 1; Indels 0; Gaps

Qy          2 HQPPQPLPPT 11
              ||| ||||||
Db          1 HQPHQPLPPT 10



RESULT 7 

US-09-867-550-978 

; Sequence 978, Application US/09867550
; Patent No. US20020082206A1
; GENERAL INFORMATION:
; APPLICANT: Leach, Martin D.
; APPLICANT: Mehraban, Fuad
; APPLICANT: Conley, Pamela
; APPLICANT: Law, Debbie
; APPLICANT: Topper, James
; TITLE OF INVENTION: Novel Polynucleotides from Atherogenic Cells and Polypeptides Encoded
; TITLE OF INVENTION: Thereby
; FILE REFERENCE: 21402-013 (Cura-313)
; CURRENT APPLICATION NUMBER: US/09/867,550
; CURRENT FILING DATE: 2001-09-20
; PRIOR APPLICATION NUMBER: USSN 60/208,427
; PRIOR FILING DATE: 2000-05-30
; NUMBER OF SEQ ID NOS: 2125
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 978
; LENGTH: 60
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-867-550-978

Query Match 59.6%; Score 53; DB 9; Length 60;
Best Local Similarity 88.9%; Pred. No. 23;
Matches 8; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy          2 HQPPQPLPP 10
              |:|||||||
Db         38 HRPPQPLPP 46



RESULT 8 

US-10-437-963-185370
; Sequence 185370, Application US/10437963
; Publication No. US20040123343A1
; GENERAL INFORMATION:
; APPLICANT: La Rosa, Thomas J.
; APPLICANT: Kovalic, David K.
; APPLICANT: Zhou, Yihua
; APPLICANT: Cao, Yongwei
; APPLICANT: Wu, Wei
; APPLICANT: Boukharov, Andrey A.
; APPLICANT: Barbazuk, Brad
; APPLICANT: Li, Ping
; TITLE OF INVENTION: Rice Nucleic Acid Molecules and Other Molecules Associated With
; TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement
; FILE REFERENCE: 38-21(53221) B
; CURRENT APPLICATION NUMBER: US/10/437,963
; CURRENT FILING DATE: 2003-05-14
; NUMBER OF SEQ ID NOS: 204966
; SEQ ID NO 185370
; LENGTH: 2683
; TYPE: PRT
; ORGANISM: Oryza sativa
; FEATURE:
; NAME/KEY: unsure
; LOCATION: (1)..(2683)
; OTHER INFORMATION: unsure at all Xaa locations
; FEATURE:
; OTHER INFORMATION: Clone ID: PAT_MRT4530_82272C.1.pep
US-10-437-963-185370

Query Match 59.6%; Score 53; DB 16; Length 2683;
Best Local Similarity 64.3%; Pred. No. 6.8e+02;
Matches 9; Conservative 1; Mismatches 4; Indels 0; Gaps 0;

Qy          2 HQPPQPLPPTVMFP 15
              |||| || ||:  |
Db       2646 HQPPVPLHPTIPXP 2659



RESULT 9 

US-10-051-874-56 

; Sequence 56, Application US/10051874
; Publication No. US20040005557A1
; GENERAL INFORMATION:
; APPLICANT: Padigaru, Muralidhara
; APPLICANT: Alsobrook II, John P
; APPLICANT: Colman, Steven D
; APPLICANT: Spytek, Kimberly A
; APPLICANT: Boldog, Ferenc
; APPLICANT: Vernet, Corine AM
; APPLICANT: Li, Li
; APPLICANT: Shenoy, Suresh G
; APPLICANT: Casman, Stacie J
; APPLICANT: Guo, Xiaojia Sasha
; APPLICANT: Edinger, Shlomit R
; APPLICANT: MacDougall, John R
; APPLICANT: Malyankar, Uriel M
; APPLICANT: Patturajan, Meera
; APPLICANT: Shimkets, Richard A
; APPLICANT: Pena, Carol EA
; APPLICANT: Tchernev, Velizar T
; APPLICANT: Zerhusen, Bryan D
; APPLICANT: Millet, Isabelle
; APPLICANT: Miller, Charles E
; APPLICANT: Lepley, Denise M
; APPLICANT: Smithson, Glennda
; APPLICANT: Baumgartner, Jason C
; APPLICANT: Herrman, John L
; APPLICANT: Peyman, John A
; APPLICANT: Gorman, Linda
; APPLICANT: Mezes, Peter D
; APPLICANT: Kekuda, Ramesh
; APPLICANT: Taupier Jr, Raymond J
; APPLICANT: Gerlach, Valerie
; APPLICANT: Grosse, William M
; APPLICANT: Liu, Xiaohong
; APPLICANT: Ellerman, Karen
; APPLICANT: Rothenberg, Mark
; APPLICANT: Stone, David J
; APPLICANT: Burgess, Catherine E
; TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF
; TITLE OF INVENTION: USING THE SAME
; FILE REFERENCE: 21402-245
; CURRENT APPLICATION NUMBER: US/10/051,874
; CURRENT FILING DATE: 2002-09-25
; PRIOR APPLICATION NUMBER: 60/268,595
; PRIOR FILING DATE: 2001-02-14
; PRIOR APPLICATION NUMBER: 60/325,306
; PRIOR FILING DATE: 2001-09-27
; PRIOR APPLICATION NUMBER: 60/262,587
; PRIOR FILING DATE: 2001-01-18
; PRIOR APPLICATION NUMBER: 60/272,409
; PRIOR FILING DATE: 2001-02-28
; PRIOR APPLICATION NUMBER: 60/262,454
; PRIOR FILING DATE: 2001-01-18
; PRIOR APPLICATION NUMBER: 60/276,777
; PRIOR FILING DATE: 2001-03-16
; PRIOR APPLICATION NUMBER: 60/291,672
; PRIOR FILING DATE: 2001-05-17
; PRIOR APPLICATION NUMBER: 60/330,336
; PRIOR FILING DATE: 2001-10-18
; PRIOR APPLICATION NUMBER: 60/265,530
; PRIOR FILING DATE: 2001-01-31
; PRIOR APPLICATION NUMBER: 60/261,376
; PRIOR FILING DATE: 2001-01-16
; NUMBER OF SEQ ID NOS: 269
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 56
; LENGTH: 4952
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-051-874-56

Query Match 59.6%; Score 53; DB 15; Length 4952;
Best Local Similarity 57.1%; Pred. No. 1.2e+03;
Matches 8; Conservative 3; Mismatches 3; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMF 14
              :|:||:| || | |
Db       1886 LHKPPRPQPPEVAF 1899



RESULT 10 
US-10-051-874-166 

; Sequence 166, Application US/10051874
; Publication No. US20040005557A1
; GENERAL INFORMATION:
; APPLICANT: Padigaru, Muralidhara
; APPLICANT: Alsobrook II, John P
; APPLICANT: Colman, Steven D
; APPLICANT: Spytek, Kimberly A
; APPLICANT: Boldog, Ferenc
; APPLICANT: Vernet, Corine AM
; APPLICANT: Li, Li
; APPLICANT: Shenoy, Suresh G
; APPLICANT: Casman, Stacie J
; APPLICANT: Guo, Xiaojia Sasha
; APPLICANT: Edinger, Shlomit R
; APPLICANT: MacDougall, John R
; APPLICANT: Malyankar, Uriel M
; APPLICANT: Patturajan, Meera
; APPLICANT: Shimkets, Richard A
; APPLICANT: Pena, Carol EA
; APPLICANT: Tchernev, Velizar T
; APPLICANT: Zerhusen, Bryan D
; APPLICANT: Millet, Isabelle
; APPLICANT: Miller, Charles E
; APPLICANT: Lepley, Denise M
; APPLICANT: Smithson, Glennda
; APPLICANT: Baumgartner, Jason C
; APPLICANT: Herrman, John L
; APPLICANT: Peyman, John A
; APPLICANT: Gorman, Linda
; APPLICANT: Mezes, Peter D
; APPLICANT: Kekuda, Ramesh
; APPLICANT: Taupier Jr, Raymond J
; APPLICANT: Gerlach, Valerie
; APPLICANT: Grosse, William M
; APPLICANT: Liu, Xiaohong
; APPLICANT: Ellerman, Karen
; APPLICANT: Rothenberg, Mark
; APPLICANT: Stone, David J
; APPLICANT: Burgess, Catherine E
; TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF
; TITLE OF INVENTION: USING THE SAME
; FILE REFERENCE: 21402-245
; CURRENT APPLICATION NUMBER: US/10/051,874
; CURRENT FILING DATE: 2002-09-25
; PRIOR APPLICATION NUMBER: 60/268,595
; PRIOR FILING DATE: 2001-02-14
; PRIOR APPLICATION NUMBER: 60/325,306
; PRIOR FILING DATE: 2001-09-27
; PRIOR APPLICATION NUMBER: 60/262,587
; PRIOR FILING DATE: 2001-01-18
; PRIOR APPLICATION NUMBER: 60/272,409
; PRIOR FILING DATE: 2001-02-28
; PRIOR APPLICATION NUMBER: 60/262,454
; PRIOR FILING DATE: 2001-01-18
; PRIOR APPLICATION NUMBER: 60/276,777
; PRIOR FILING DATE: 2001-03-16
; PRIOR APPLICATION NUMBER: 60/291,672
; PRIOR FILING DATE: 2001-05-17
; PRIOR APPLICATION NUMBER: 60/330,336
; PRIOR FILING DATE: 2001-10-18
; PRIOR APPLICATION NUMBER: 60/265,530
; PRIOR FILING DATE: 2001-01-31
; PRIOR APPLICATION NUMBER: 60/261,376
; PRIOR FILING DATE: 2001-01-16
; NUMBER OF SEQ ID NOS: 269
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 166
; LENGTH: 5008
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-051-874-166

Query Match 59.6%; Score 53; DB 15; Length 5008;
Best Local Similarity 57.1%; Pred. No. 1.2e+03;
Matches 8; Conservative 3; Mismatches 3; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMF 14
              :|:||:| || | |
Db       1937 LHKPPRPQPPEVAF 1950


RESULT 11 
US-10-085-198-112 

; Sequence 112, Application US/10085198 

; Publication No. US20040009907A1

; GENERAL INFORMATION: 

; APPLICANT: Alsobrook et al.

; TITLE OF INVENTION: Proteins and Nucleic Acids Encoding Same 

; FILE REFERENCE: 21402-279 

; CURRENT APPLICATION NUMBER: US/10/085, 198 

; CURRENT FILING DATE: 2002-02-25 

; PRIOR APPLICATION NUMBER: 60/271,646 

; PRIOR FILING DATE: 2001-02-26 

; PRIOR APPLICATION NUMBER: 60/276,401 

; PRIOR FILING DATE: 2001-03-16 

; PRIOR APPLICATION NUMBER: 60/311,981 

; PRIOR FILING DATE: 2001-08-13 

; PRIOR APPLICATION NUMBER: 60/312,858

; PRIOR FILING DATE: 2001-08-16 

; PRIOR APPLICATION NUMBER: 60/271,840 

; PRIOR FILING DATE: 2001-02-27 

; PRIOR APPLICATION NUMBER: 60/277,324 

; PRIOR FILING DATE: 2001-03-20 

; PRIOR APPLICATION NUMBER: 60/286,096 

; PRIOR FILING DATE: 2001-04-21 

; PRIOR APPLICATION NUMBER: 60/299,695 

; PRIOR FILING DATE: 2001-06-20 

; PRIOR APPLICATION NUMBER: 60/315,614 

; PRIOR FILING DATE: 2001-08-29 

; PRIOR APPLICATION NUMBER: 60/272,405 

; PRIOR FILING DATE: 2001-02-28 

; Remaining Prior Application data removed - See File Wrapper or PALM. 

; NUMBER OF SEQ ID NOS: 653
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 112
; LENGTH: 5159
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-085-198-112 

Query Match 59.6%; Score 53; DB 15; Length 5159; 

Best Local Similarity 57.1%; Pred. No. 1.2e+03; 

Matches 8; Conservative 3; Mismatches 3; Indels 0; Gaps 0; 

Qy          1 MHQPPQPLPPTVMF 14
              :|:||:| || | |
Db       1886 LHKPPRPQPPEVAF 1899



RESULT 12 
US-10-051-874-165 

; Sequence 165, Application US/10051874
; Publication No. US20040005557A1
; GENERAL INFORMATION:
; APPLICANT: Padigaru, Muralidhara
; APPLICANT: Alsobrook II, John P
; APPLICANT: Colman, Steven D
; APPLICANT: Spytek, Kimberly A
; APPLICANT: Boldog, Ferenc
; APPLICANT: Vernet, Corine AM
; APPLICANT: Li, Li
; APPLICANT: Shenoy, Suresh G
; APPLICANT: Casman, Stacie J
; APPLICANT: Guo, Xiaojia Sasha
; APPLICANT: Edinger, Shlomit R
; APPLICANT: MacDougall, John R
; APPLICANT: Malyankar, Uriel M
; APPLICANT: Patturajan, Meera
; APPLICANT: Shimkets, Richard A
; APPLICANT: Pena, Carol EA
; APPLICANT: Tchernev, Velizar T
; APPLICANT: Zerhusen, Bryan D
; APPLICANT: Millet, Isabelle
; APPLICANT: Miller, Charles E
; APPLICANT: Lepley, Denise M
; APPLICANT: Smithson, Glennda
; APPLICANT: Baumgartner, Jason C
; APPLICANT: Herrman, John L
; APPLICANT: Peyman, John A
; APPLICANT: Gorman, Linda
; APPLICANT: Mezes, Peter D
; APPLICANT: Kekuda, Ramesh
; APPLICANT: Taupier Jr, Raymond J
; APPLICANT: Gerlach, Valerie
; APPLICANT: Grosse, William M
; APPLICANT: Liu, Xiaohong
; APPLICANT: Ellerman, Karen
; APPLICANT: Rothenberg, Mark
; APPLICANT: Stone, David J
; APPLICANT: Burgess, Catherine E
; TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF
; TITLE OF INVENTION: USING THE SAME
; FILE REFERENCE: 21402-245
; CURRENT APPLICATION NUMBER: US/10/051,874
; CURRENT FILING DATE: 2002-09-25
; PRIOR APPLICATION NUMBER: 60/268,595
; PRIOR FILING DATE: 2001-02-14
; PRIOR APPLICATION NUMBER: 60/325,306
; PRIOR FILING DATE: 2001-09-27
; PRIOR APPLICATION NUMBER: 60/262,587
; PRIOR FILING DATE: 2001-01-18
; PRIOR APPLICATION NUMBER: 60/272,409
; PRIOR FILING DATE: 2001-02-28
; PRIOR APPLICATION NUMBER: 60/262,454
; PRIOR FILING DATE: 2001-01-18
; PRIOR APPLICATION NUMBER: 60/276,777
; PRIOR FILING DATE: 2001-03-16
; PRIOR APPLICATION NUMBER: 60/291,672
; PRIOR FILING DATE: 2001-05-17
; PRIOR APPLICATION NUMBER: 60/330,336
; PRIOR FILING DATE: 2001-10-18
; PRIOR APPLICATION NUMBER: 60/265,530
; PRIOR FILING DATE: 2001-01-31
; PRIOR APPLICATION NUMBER: 60/261,376
; PRIOR FILING DATE: 2001-01-16
; NUMBER OF SEQ ID NOS: 269
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 165
; LENGTH: 5262
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-051-874-165

Query Match 59.6%; Score 53; DB 15; Length 5262;
Best Local Similarity 57.1%; Pred. No. 1.2e+03;
Matches 8; Conservative 3; Mismatches 3; Indels

Qy          1 MHQPPQPLPPTVMF 14
              :|:||:| || | |
Db       2191 LHKPPRPQPPEVAF 2204



RESULT 13 
US-10-051-874-167 

Sequence 167, Application US/10051874 
Publication No. US20040005557A1 
GENERAL INFORMATION: 



APPLICANT: Padigaru, Muralidhara
APPLICANT: Alsobrook II, John P
APPLICANT: Colman, Steven D
APPLICANT: Spytek, Kimberly A
APPLICANT: Boldog, Ferenc
APPLICANT: Vernet, Corine AM
APPLICANT: Li, Li
APPLICANT: Shenoy, Suresh G
APPLICANT: Gasman, Stacie J
APPLICANT: Guo, Xiaojia Sasha
APPLICANT: Edinger, Shlomit R
APPLICANT: MacDougall, John R
APPLICANT: Malyankar, Uriel M
APPLICANT: Patturajan, Meera
APPLICANT: Shimkets, Richard ?
APPLICANT: Pena, Carol EA
APPLICANT: Tchernev, Velizar I
APPLICANT: Zerhusen, Bryan D
APPLICANT: Millet, Isabelle
APPLICANT: Miller, Charles E
APPLICANT: Lepley, Denise M
APPLICANT: Smithson, Glennda
APPLICANT: Baumgartner, Jason
APPLICANT: Herrman, John L
APPLICANT: Peyman, John A
APPLICANT: Gorman, Linda
APPLICANT: Mezes, Peter D
APPLICANT: Kekuda, Ramesh
APPLICANT: Taupier Jr, Raymond J
APPLICANT: Gerlach, Valerie
APPLICANT: Grosse, William M
APPLICANT: Liu, Xiaohong
APPLICANT: Ellerman, Karen
APPLICANT: Rothenberg, Mark
APPLICANT: Stone, David J
APPLICANT: Burgess, Catherine E
TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF
TITLE OF INVENTION: USING THE SAME
FILE REFERENCE: 21402-245 

CURRENT APPLICATION NUMBER: US/10/051,874
CURRENT FILING DATE: 2002-09-25 
PRIOR APPLICATION NUMBER: 60/268,595 
PRIOR FILING DATE: 2001-02-14 
PRIOR APPLICATION NUMBER: 60/325,306 
PRIOR FILING DATE: 2001-09-27 
PRIOR APPLICATION NUMBER: 60/262,587 
PRIOR FILING DATE: 2001-01-18 
PRIOR APPLICATION NUMBER: 60/272,409 
PRIOR FILING DATE: 2001-02-28 
PRIOR APPLICATION NUMBER: 60/262,454 
PRIOR FILING DATE: 2001-01-18 
PRIOR APPLICATION NUMBER: 60/276,777 
PRIOR FILING DATE: 2001-03-16 
PRIOR APPLICATION NUMBER: 60/291,672 
PRIOR FILING DATE: 2001-05-17 
PRIOR APPLICATION NUMBER: 60/330,336 
PRIOR FILING DATE: 2001-10-18 
PRIOR APPLICATION NUMBER: 60/265,530 
PRIOR FILING DATE: 2001-01-31 
PRIOR APPLICATION NUMBER: 60/261,376 
PRIOR FILING DATE: 2001-01-16 
NUMBER OF SEQ ID NOS: 269
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 167 
LENGTH: 5262 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-051-874-167 

Query Match 59.6%; Score 53; DB 15; Length 5262;
Best Local Similarity 57.1%; Pred. No. 1.2e+03;
Matches 8; Conservative 3; Mismatches 3; Indels 0; Gaps 0;

Qy    1 MHQPPQPLPPTVMF 14
        :|:||:| || | |
Db 2191 LHKPPRPQPPEVAF 2204



RESULT 14 

US-10-437-963-162903 

Sequence 162903, Application US/10437963
Publication No. US20040123343A1
GENERAL INFORMATION:
APPLICANT: La Rosa, Thomas J.
APPLICANT: Kovalic, David K.
APPLICANT: Zhou, Yihua
APPLICANT: Cao, Yongwei
APPLICANT: Wu, Wei
APPLICANT: Boukharov, Andrey A.
APPLICANT: Barbazuk, Brad
APPLICANT: Li, Ping

TITLE OF INVENTION: Rice Nucleic Acid Molecules and Other Molecules Associated With
TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement
FILE REFERENCE: 38-21(53221)B
CURRENT APPLICATION NUMBER: US/10/437,963
CURRENT FILING DATE: 2003-05-14
NUMBER OF SEQ ID NOS: 204966
SEQ ID NO 162903
LENGTH: 643
TYPE: PRT
ORGANISM: Oryza sativa
FEATURE:
OTHER INFORMATION: Clone ID: PAT_MRT4530_6194C.1.pep
US-10-437-963-162903 



Query Match 58.4%; Score 52; DB 16; Length 643;
Best Local Similarity 75.0%; Pred. No. 2.5e+02;
Matches 9; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy  4 PPQPLPPTVMFP 15
      || | ||| |||
Db 88 PPPPPPPTKMFP 99



RESULT 15 

US-10-437-963-161687 

Sequence 161687, Application US/10437963 
Publication No. US20040123343A1
GENERAL INFORMATION: 
APPLICANT: La Rosa, Thomas J. 
APPLICANT: Kovalic, David K. 
APPLICANT: Zhou, Yihua 
APPLICANT: Cao, Yongwei 
APPLICANT: Wu, Wei 
APPLICANT: Boukharov, Andrey A. 
APPLICANT: Barbazuk, Brad 
APPLICANT: Li, Ping 

TITLE OF INVENTION: Rice Nucleic Acid Molecules and Other Molecules Associated With
TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement
FILE REFERENCE: 38-21(53221)B
CURRENT APPLICATION NUMBER: US/10/437,963
CURRENT FILING DATE: 2003-05-14
NUMBER OF SEQ ID NOS: 204966
SEQ ID NO 161687
LENGTH: 95
TYPE: PRT
ORGANISM: Oryza sativa
FEATURE:
OTHER INFORMATION: Clone ID: PAT_MRT4530_60849C.1.pep
US-10-437-963-161687 

Query Match 57.3%; Score 51; DB 16; Length 95; 

Best Local Similarity 80.0%; Pred. No. 59; 

Matches 8; Conservative 1; Mismatches 1; Indels 0; Gaps 0; 

Qy  2 HQPPQPLPPT 11
      |:|||| |||
Db 53 HKPPQPPPPT 62



Search completed: August 24, 2004, 16:41:30 
Job time : 55.291 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on: August 24, 2004, 15:23:00 ; Search time 46.3433 Seconds
(without alignments)
102.124 Million cell updates/sec

Title: US-09-641-801-34
Perfect score: 89
Sequence: 1 MHQPPQPLPPTVMFP 15

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 1017041 seqs, 315518202 residues

Total number of hits satisfying chosen parameters: 1017041

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : SPTREMBL_25:*
1: sp_archea:*
2: sp_bacteria:*
3: sp_fungi:*
4: sp_human:*
5: sp_invertebrate:*
6: sp_mammal:*
7: sp_mhc:*
8: sp_organelle:*
9: sp_phage:*
10: sp_plant:*
11: sp_rodent:*
12: sp_virus:*
13: sp_vertebrate:*
14: sp_unclassified:*
15: sp_rvirus:*
16: sp_bacteriap:*
17: sp_archeap:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
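
The "Perfect score" of 89 and the "Query Match" percentages in this listing appear to be consistent with scoring the query against itself on the BLOSUM62 diagonal and then dividing each hit's score by that value. The following is a minimal sketch under that assumption; the relationship is inferred from the numbers printed in this report, not taken from GenCore documentation, and the function names are hypothetical.

```python
# Sketch only: assumes Query Match % = 100 * hit score / perfect score,
# where the perfect score is the query's BLOSUM62 self-alignment score.

# BLOSUM62 diagonal (self-substitution) values for the residues in the query.
BLOSUM62_DIAGONAL = {
    "M": 5, "H": 8, "Q": 5, "P": 7,
    "L": 4, "T": 5, "V": 4, "F": 6,
}

QUERY = "MHQPPQPLPPTVMFP"


def perfect_score(seq):
    """Sum the BLOSUM62 self-scores of every residue in seq."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in seq)


def query_match_percent(score, perfect):
    """Express a hit score as a percentage of the perfect score."""
    return round(100.0 * score / perfect, 1)


if __name__ == "__main__":
    perfect = perfect_score(QUERY)
    print("perfect score:", perfect)
    for hit_score in (83, 81, 53):
        print(hit_score, query_match_percent(hit_score, perfect))
```

Under this reading, a score of 83 corresponds to the 93.3% reported for the top hit, and a score of 53 to 59.6%.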

SUMMARIES 



Result       Query
 No.  Score  Match Length  DB  ID      Description
   1     83   93.3    141   6  Q27953  Q27953 balaenopter
   2     81   91.0    141   6  Q27939  Q27939 alces alces
   3     81   91.0    141   6  Q27938  Q27938 antilocapra
   4     81   91.0    223   6  Q95L76  Q95l76 capra hircu
   5     80   89.9    146   6  Q9BDG5  Q9bdg5 bos taurus
   6     78   87.6    141   6  P79231  P79231 physeter ca
   7     75   84.3    141   6  Q28418  Q28418 giraffa cam
   8     65   73.0    141   6  Q28355  Q28355 delphinapte
   9     63   70.8     72   6  Q28795  Q28795 tayassu taj
  10     56   62.9    412   4  Q9H5E0  Q9h5e0 homo sapien
  11     55   61.8    117   6  Q28442  Q28442 hippopotamu
  12     54   60.7    141   6  Q29136  Q29136 tapirus ind
  13     53   59.6    145   6  Q29151  Q29151 uncia uncia
  14     53   59.6    421   4  Q8TC66  Q8tc66 homo sapien
  15     52   58.4    250   6  Q9N2G8  Q9n2g8 canis famil
  16     52   58.4   1596   5  Q9VWC6  Q9vwc6 drosophila
  17     52   58.4   1596   5  Q7YU17  Q7yu17 drosophila
  18     51   57.3   2876  10  Q9FE41  Q9fe41 oryza sativ
  19     50   56.2    169   5  Q8SVY7  Q8svy7 encephalito
  20     50   56.2    287   4  Q9H6N8  Q9h6n8 homo sapien
  21     50   56.2    295   5  Q21781  Q21781 caenorhabdi
  22     50   56.2    633   4  Q96RR9  Q96rr9 homo sapien
  23     50   56.2    686   3  Q872B3  Q872b3 neurospora
  24     50   56.2   1270   4  Q96JH2  Q96jh2 homo sapien
  25     49   55.1    211  10  Q84PY7  Q84py7 oryza sativ
  26     49   55.1    292   5  O62458  O62458 caenorhabdi
  27     49   55.1    337  10  Q8GYK6  Q8gyk6 arabidopsis
  28     49   55.1    346  10  O50068  O50068 arabidopsis
  29     49   55.1    818  11  Q8C8U8  Q8c8u8 mus musculu
  30     49   55.1   1385  13  Q7ZZ45  Q7zz45 brachydanio
  31     48   53.9    113  10  Q7XSL3  Q7xsl3 oryza sativ
  32     48   53.9    141   6  Q28229  Q28229 camelus dro
  33     48   53.9    420   5  O17057  O17057 caenorhabdi
  34     48   53.9    637   5  Q86KG7  Q86kg7 dictyosteli
  35   47.5   53.4    261   5  Q9VAX5  Q9vax5 drosophila
  36     47   52.8    141   6  Q29139  Q29139 tragulus na
  37     47   52.8    191  16  Q9A4I6  Q9a4i6 caulobacter
  38     47   52.8    214  16  Q7UYU1  Q7uyu1 rhodopirell
  39     47   52.8    270  16  Q9RMT3  Q9rmt3 pseudomonas
  40     47   52.8    331  10  Q7XUZ3  Q7xuz3 oryza sativ
  41     47   52.8    349  16  Q88KN8  Q88kn8 pseudomonas
  42     47   52.8    400   5  Q961D1  Q961d1 drosophila
  43     47   52.8    482   2  Q7WUV5  Q7wuv5 rickettsia
  44     47   52.8    487  10  Q9FKI9  Q9fki9 arabidopsis
  45     47   52.8    487  12  Q805Y1  Q805y1 simian herp



ALIGNMENTS 



RESULT 1 
Q27953 

ID Q27953 PRELIMINARY; PRT; 141 AA.
AC Q27953;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update)
DT 01-DEC-2001 (TrEMBLrel. 19, Last annotation update)
DE B-casein (Fragment).
OS Balaenoptera physalus (Finback whale) (Common rorqual).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Cetacea; Mysticeti;
OC Balaenopteridae; Balaenoptera.
OX NCBI_TaxID=9770;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=96364219; PubMed=8752004;
RA Gatesy J., Hayashi C., Cronin M.A., Arctander P.;
RT "Evidence from milk casein genes that cetaceans are close relatives of
RT hippopotamid artiodactyls.";
RL Mol. Biol. Evol. 13:954-963(1996).
DR EMBL; U53900; AAB08405.1; -.
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
FT NON_TER 1 1
FT NON_TER 141 141
SQ SEQUENCE 141 AA; 15822 MW; 7C3EDEE320034513 CRC64;

Query Match 93.3%; Score 83; DB 6; Length 141;
Best Local Similarity 93.3%; Pred. No. 0.00014;
Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy  1 MHQPPQPLPPTVMFP 15
      ||||||||||| |||
Db 96 MHQPPQPLPPTPMFP 110



RESULT 2 
Q27939 

ID Q27939 PRELIMINARY; PRT; 141 AA.
AC Q27939;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update)
DT 01-DEC-2001 (TrEMBLrel. 19, Last annotation update)
DE B-casein (Fragment).
OS Alces alces (moose).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Cervoidea;
OC Cervidae; Odocoileinae; Alces.
OX NCBI_TaxID=9852;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=96364219; PubMed=8752004;
RA Gatesy J., Hayashi C., Cronin M.A., Arctander P.;
RT "Evidence from milk casein genes that cetaceans are close relatives of
RT hippopotamid artiodactyls.";
RL Mol. Biol. Evol. 13:954-963(1996).
DR EMBL; U53896; AAB08403.1; -.
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
FT NON_TER 1 1
FT NON_TER 141 141
SQ SEQUENCE 141 AA; 15763 MW; DC39F68595C13C72 CRC64;

Query Match 91.0%; Score 81; DB 6; Length 141;
Best Local Similarity 93.3%; Pred. No. 0.00027;
Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy  1 MHQPPQPLPPTVMFP 15
      ||| |||||||||||
Db 96 MHQTPQPLPPTVMFP 110



RESULT 3 
Q27938 

ID Q27938 PRELIMINARY; PRT; 141 AA.
AC Q27938;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update)
DT 01-DEC-2001 (TrEMBLrel. 19, Last annotation update)
DE B-casein (Fragment).
OS Antilocapra americana (Pronghorn).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea;
OC Antilocapridae; Antilocapra.
OX NCBI_TaxID=9891;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=96364219; PubMed=8752004;
RA Gatesy J., Hayashi C., Cronin M.A., Arctander P.;
RT "Evidence from milk casein genes that cetaceans are close relatives of
RT hippopotamid artiodactyls.";
RL Mol. Biol. Evol. 13:954-963(1996).
DR EMBL; U53895; AAB08402.1; -.
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
FT NON_TER 1 1
FT NON_TER 141 141
SQ SEQUENCE 141 AA; 15667 MW; F1112AF1617119BB CRC64;

Query Match 91.0%; Score 81; DB 6; Length 141;
Best Local Similarity 93.3%; Pred. No. 0.00027;
Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy  1 MHQPPQPLPPTVMFP 15
      ||| |||||||||||
Db 96 MHQSPQPLPPTVMFP 110



RESULT 4 
Q95L76 

ID Q95L76 PRELIMINARY; PRT; 223 AA.
AC Q95L76;
DT 01-DEC-2001 (TrEMBLrel. 19, Created)
DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update)
DT 01-MAR-2002 (TrEMBLrel. 20, Last annotation update)
DE Beta-casein precursor.
GN CSN2.
OS Capra hircus (Goat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea;
OC Bovidae; Caprinae; Capra.
OX NCBI_TaxID=9925;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Blood;
RA Wang Q., Huang Z., Chen M.J., Huang S.Z., Zeng Y.T.;
RL Submitted (AUG-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AF409096; AAK97639.1; -.
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
KW Signal.
FT SIGNAL 1 15 POTENTIAL.
FT CHAIN 16 223 BETA-CASEIN.
SQ SEQUENCE 223 AA; 24992 MW; 35A8BE17746A01DB CRC64;

Query Match 91.0%; Score 81; DB 6; Length 223;
Best Local Similarity 93.3%; Pred. No. 0.00042;
Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy   1 MHQPPQPLPPTVMFP 15
       |||||||| ||||||
Db 159 MHQPPQPLSPTVMFP 173



RESULT 5 
Q9BDG5 

ID Q9BDG5 PRELIMINARY; PRT; 146 AA.
AC Q9BDG5;
DT 01-JUN-2001 (TrEMBLrel. 17, Created)
DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update)
DT 01-OCT-2001 (TrEMBLrel. 18, Last annotation update)
DE Beta casein B (Fragment).
GN BCN B.
OS Bos taurus (Bovine).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea;
OC Bovidae; Bovinae; Bos.
OX NCBI_TaxID=9913;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Mammary gland;
RA Klotz A., Buchberger J., Krause I., Einspanier R.;
RT "Characterization of milk proteins.";
RL Submitted (JAN-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AJ296330; CAC37028.1; -.
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
FT NON_TER 1 1
FT NON_TER 146 146
SQ SEQUENCE 146 AA; 16453 MW; 48A77E25740A9891 CRC64;

Query Match 89.9%; Score 80; DB 6; Length 146;
Best Local Similarity 93.3%; Pred. No. 0.00039;
Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy  1 MHQPPQPLPPTVMFP 15
      |||| ||||||||||
Db 97 MHQPHQPLPPTVMFP 111



RESULT 6 
P79231 



ID P79231 PRELIMINARY; PRT; 141 AA.
AC P79231;
DT 01-MAY-1997 (TrEMBLrel. 03, Created)
DT 01-MAY-1997 (TrEMBLrel. 03, Last sequence update)
DT 01-DEC-2001 (TrEMBLrel. 19, Last annotation update)
DE Beta casein (Fragment).
OS Physeter catodon (Sperm whale) (Physeter macrocephalus).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Cetacea; Odontoceti;
OC Physeteridae; Physeter.
OX NCBI_TaxID=9755;
RN [1]
RP SEQUENCE FROM N.A.
RA Gatesy J.;
RT "More DNA support for a Cetacea/Hippopotamidae clade: the blood
RT clotting protein gene g-fibrinogen.";
RL Mol. Biol. Evol. 0:0-0(1997).
DR EMBL; U86644; AAB47430.1; -.
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
FT NON_TER 1 1
FT NON_TER 141 141
SQ SEQUENCE 141 AA; 15867 MW; 0267BA4DD8FEB9B2 CRC64;

Query Match 87.6%; Score 78; DB 6; Length 141;
Best Local Similarity 86.7%; Pred. No. 0.00072;
Matches 13; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy  1 MHQPPQPLPPTVMFP 15
      ||||| ||||| |||
Db 96 MHQPPHPLPPTPMFP 110



RESULT 7 
Q28418 

ID Q28418 PRELIMINARY; PRT; 141 AA.
AC Q28418;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update)
DT 01-DEC-2001 (TrEMBLrel. 19, Last annotation update)
DE B-casein (Fragment).
OS Giraffa camelopardalis (Giraffe).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Giraffoidea;
OC Giraffidae; Giraffa.
OX NCBI_TaxID=9894;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=96364219; PubMed=8752004;
RA Gatesy J., Hayashi C., Cronin M.A., Arctander P.;
RT "Evidence from milk casein genes that cetaceans are close relatives of
RT hippopotamid artiodactyls.";
RL Mol. Biol. Evol. 13:954-963(1996).
DR EMBL; U53897; AAB08412.1; -.
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
FT NON_TER 1 1
FT NON_TER 141 141
SQ SEQUENCE 141 AA; 15712 MW; 546DBF082A26BD73 CRC64;

Query Match 84.3%; Score 75; DB 6; Length 141;
Best Local Similarity 86.7%; Pred. No. 0.002;
Matches 13; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy  1 MHQPPQPLPPTVMFP 15
      ||| ||||||||| |
Db 96 MHQSPQPLPPTVMLP 110



RESULT 8 
Q28355 

ID Q28355 PRELIMINARY; PRT; 141 AA.
AC Q28355;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update)
DT 01-DEC-2001 (TrEMBLrel. 19, Last annotation update)
DE B-casein (Fragment).
OS Delphinapterus leucas (Beluga whale).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Cetacea; Odontoceti;
OC Monodontidae; Delphinapterus.
OX NCBI_TaxID=9749;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=96364219; PubMed=8752004;
RA Gatesy J., Hayashi C., Cronin M.A., Arctander P.;
RT "Evidence from milk casein genes that cetaceans are close relatives of
RT hippopotamid artiodactyls.";
RL Mol. Biol. Evol. 13:954-963(1996).
DR EMBL; U53899; AAB08408.1; -.
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
FT NON_TER 1 1
FT NON_TER 141 141
SQ SEQUENCE 141 AA; 16113 MW; F2619AEE88864D5A CRC64;

Query Match 73.0%; Score 65; DB 6; Length 141;
Best Local Similarity 73.3%; Pred. No. 0.053;
Matches 11; Conservative 0; Mismatches 4; Indels 0; Gaps 0;

Qy  1 MHQPPQPLPPTVMFP 15
      |||||   ||| |||
Db 96 MHQPPHRFPPTPMFP 110



RESULT 9 
Q28795 

ID Q28795 PRELIMINARY; PRT; 72 AA.
AC Q28795;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update)
DT 01-DEC-2001 (TrEMBLrel. 19, Last annotation update)
DE B-casein (Fragment).
OS Tayassu tajacu (Collared peccary) (Pecari tajacu).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Suina; Tayassuidae; Pecari.
OX NCBI_TaxID=9829;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=96364219; PubMed=8752004;
RA Gatesy J., Hayashi C., Cronin M.A., Arctander P.;
RT "Evidence from milk casein genes that cetaceans are close relatives of
RT hippopotamid artiodactyls.";
RL Mol. Biol. Evol. 13:954-963(1996).
DR EMBL; U53903; AAB08417.1; -.
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
FT NON_TER 1 1
FT NON_TER 72 72
SQ SEQUENCE 72 AA; 8039 MW; 0EAB068949A1112E CRC64;

Query Match 70.8%; Score 63; DB 6; Length 72;
Best Local Similarity 73.3%; Pred. No. 0.054;
Matches 11; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy  1 MHQPPQPLPPTVMFP 15
      ||| |||:| | |||
Db 51 MHQVPQPIPRTPMFP 65



RESULT 10
Q9H5E0
ID Q9H5E0 PRELIMINARY; PRT; 412 AA.
AC Q9H5E0;
DT 01-MAR-2001 (TrEMBLrel. 16, Created)
DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Hypothetical protein FLJ23531.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Lung;
RA Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K.,
RA Nakajima Y., Mizuno T., Morinaga M., Tanigami A., Fujiwara T., Ono T.,
RA Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., Ota T., Suzuki Y.,
RA Obayashi M., Nishi T., Shibahara T., Tanaka T., Nakamura Y.,
RA Isogai T., Sugano S.;
RT "NEDO human cDNA sequencing project.";
RL Submitted (AUG-2000) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AK027184; BAB15686.1; -.
KW Hypothetical protein.
SQ SEQUENCE 412 AA; 46539 MW; D72A6D830BB12B94 CRC64;

Query Match 62.9%; Score 56; DB 4; Length 412;
Best Local Similarity 69.2%; Pred. No. 2.9;
Matches 9; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy  3 QPPQPLPPTVMFP 15
      ||||| ||  :||
Db 15 QPPQPAPPPPLFP 27



RESULT 11
Q28442
ID Q28442 PRELIMINARY; PRT; 117 AA.
AC Q28442;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update)
DT 01-DEC-2001 (TrEMBLrel. 19, Last annotation update)
DE B-casein (Fragment).
OS Hippopotamus amphibius (Hippopotamus).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Hippopotamidae; Hippopotamus.
OX NCBI_TaxID=9833;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=96364219; PubMed=8752004;
RA Gatesy J., Hayashi C., Cronin M.A., Arctander P.;
RT "Evidence from milk casein genes that cetaceans are close relatives of
RT hippopotamid artiodactyls.";
RL Mol. Biol. Evol. 13:954-963(1996).
DR EMBL; U53901; AAB08414.1; -.
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
FT NON_TER 1 1
FT NON_TER 117 117
SQ SEQUENCE 117 AA; 13179 MW; A50C6D9F6A126FC4 CRC64;

Query Match 61.8%; Score 55; DB 6; Length 117;
Best Local Similarity 66.7%; Pred. No. 1.2;
Matches 10; Conservative 1; Mismatches 4; Indels 0; Gaps 0;

Qy  1 MHQPPQPLPPTVMFP 15
      ||   ||| ||:|||
Db 68 MHPXSQPLSPTLMFP 82



RESULT 12 
Q29136 
ID Q29136 PRELIMINARY; PRT; 141 AA.
AC Q29136;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update)
DT 01-DEC-2001 (TrEMBLrel. 19, Last annotation update)
DE B-casein (Fragment).
OS Tapirus indicus (Asiatic tapir) (Malayan tapir).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Perissodactyla; Tapiridae; Tapirus.
OX NCBI_TaxID=9802;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=96364219; PubMed=8752004;
RA Gatesy J., Hayashi C., Cronin M.A., Arctander P.;
RT "Evidence from milk casein genes that cetaceans are close relatives of
RT hippopotamid artiodactyls.";
RL Mol. Biol. Evol. 13:954-963(1996).
DR EMBL; U53904; AAB08419.1; -.
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
FT NON_TER 1 1
FT NON_TER 141 141
SQ SEQUENCE 141 AA; 15952 MW; 2F6ECAA9B3123B4B CRC64;

Query Match 60.7%; Score 54; DB 6; Length 141;
Best Local Similarity 66.7%; Pred. No. 2;
Matches 10; Conservative 1; Mismatches 4; Indels 0; Gaps 0;

Qy  1 MHQPPQPLPPTVMFP 15
      ||| ||||  |:| |
Db 96 MHQVPQPLHQTLMLP 110



RESULT 13 
Q29151 

ID Q29151 PRELIMINARY; PRT; 145 AA.
AC Q29151;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update)
DT 01-DEC-2001 (TrEMBLrel. 19, Last annotation update)
DE B-casein (Fragment).
OS Uncia uncia (Snow leopard) (Panthera uncia).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Carnivora; Fissipedia; Felidae; Uncia.
OX NCBI_TaxID=29064;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=96364219; PubMed=8752004;
RA Gatesy J., Hayashi C., Cronin M.A., Arctander P.;
RT "Evidence from milk casein genes that cetaceans are close relatives of
RT hippopotamid artiodactyls.";
RL Mol. Biol. Evol. 13:954-963(1996).
DR EMBL; U53906; AAB08422.1; -.
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
FT NON_TER 1 1
FT NON_TER 145 145
SQ SEQUENCE 145 AA; 16379 MW; CAD027AAEE62012A CRC64;

Query Match 59.6%; Score 53; DB 6; Length 145;
Best Local Similarity 66.7%; Pred. No. 2.9;
Matches 10; Conservative 0; Mismatches 5; Indels 0; Gaps 0;

Qy   1 MHQPPQPLPPTVMFP 15
       ||| || || | | |
Db 100 MHQIPQALPQTTMLP 114



RESULT 14 



Q8TC66 

ID Q8TC66 PRELIMINARY; PRT; 421 AA.
AC Q8TC66;
DT 01-JUN-2002 (TrEMBLrel. 21, Created)
DT 01-JUN-2002 (TrEMBLrel. 21, Last sequence update)
DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update)
DE Similar to hypothetical protein DKFZp547H236.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Testis;
RA Strausberg R.;
RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; BC025404; AAH25404.1; -.
DR GO; GO:0005634; C:nucleus; IEA.
DR GO; GO:0003700; F:transcription factor activity; IEA.
DR GO; GO:0005215; F:transporter activity; IEA.
DR GO; GO:0006355; P:regulation of transcription, DNA-dependent; IEA.
DR GO; GO:0006810; P:transport; IEA.
DR InterPro; IPR001356; Homeobox.
DR InterPro; IPR000566; Lipocln_cytFABP.
DR Pfam; PF00046; homeobox; 1.
DR ProDom; PD000010; Homeobox; 1.
DR SMART; SM00389; HOX; 1.
DR PROSITE; PS50071; HOMEOBOX_2; 1.
DR PROSITE; PS00213; LIPOCALIN; 1.
KW Hypothetical protein.
SQ SEQUENCE 421 AA; 46174 MW; 3622AAB62D32561F CRC64;

Query Match 59.6%; Score 53; DB 4; Length 421;
Best Local Similarity 88.9%; Pred. No. 8;
Matches 8; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy  2 HQPPQPLPP 10
      |:|||||||
Db 38 HRPPQPLPP 46



RESULT 15
Q9N2G8
ID Q9N2G8 PRELIMINARY; PRT; 250 AA.
AC Q9N2G8;
DT 01-OCT-2000 (TrEMBLrel. 15, Created)
DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update)
DT 01-JUN-2002 (TrEMBLrel. 21, Last annotation update)
DE Beta-casein.
OS Canis familiaris (Dog).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Carnivora; Fissipedia; Canidae; Canis.
OX NCBI_TaxID=9615;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=20541290; PubMed=11092743;
RA Watanabe M., Sugano S., Togashi T., Imai J., Uchida K., Yamaguchi R.,
RA Tateyama S.;
RT "Molecular cloning and phylogenetic analysis of canine beta-casein.";
RL DNA Seq. 11:295-300(2000).
DR EMBL; AB035080; BAA95931.1; -.
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
DR PROSITE; PS00306; CASEIN_ALPHA_BETA; 1.
SQ SEQUENCE 250 AA; 28401 MW; 1D58391E7BF97ED8 CRC64;

Query Match 58.4%; Score 52; DB 6; Length 250;
Best Local Similarity 76.9%; Pred. No. 6.8;
Matches 10; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy   1 MHQPPQPLPPTVM 13
       ||| ||||| | |
Db 178 MHQIPQPLPQTPM 190



Search completed: August 24, 2004, 15:51:01 
Job time : 49.3433 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 
Run on: August 24, 2004, 14:57:04 ; Search time 8.0597 Seconds
(without alignments)
96.908 Million cell updates/sec

Title: US-09-641-801-34
Perfect score: 89
Sequence: 1 MHQPPQPLPPTVMFP 15

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 141681 seqs, 52070155 residues

Total number of hits satisfying chosen parameters: 141681

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : SwissProt_42:*
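
The "Million cell updates/sec" figure in the run header appears to follow from the size of the dynamic-programming matrix (query length times database residues) divided by the search time. The sketch below assumes that definition; it reproduces the printed values for both this Swiss-Prot run and the SPTREMBL run earlier in the report, but it is an inference from the numbers rather than a documented GenCore formula, and the function name is hypothetical.

```python
# Sketch only: assumes one "cell update" per (query residue, database residue) pair.

def million_cell_updates_per_sec(query_len, db_residues, search_seconds):
    """Return search throughput in millions of matrix cells per second."""
    cells = query_len * db_residues
    return cells / search_seconds / 1e6


if __name__ == "__main__":
    # 15-residue query against Swiss-Prot (52070155 residues, 8.0597 s).
    print(round(million_cell_updates_per_sec(15, 52070155, 8.0597), 3))
    # Same query against SPTREMBL (315518202 residues, 46.3433 s).
    print(round(million_cell_updates_per_sec(15, 315518202, 46.3433), 3))
```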



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
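
For the ungapped alignments in this report, the "Matches" count and "Best Local Similarity" percentage appear to be the number of identical positions and that number divided by the aligned length. A minimal sketch under that assumption follows; it handles only alignments without indels, and the function name is hypothetical.

```python
# Sketch only: assumes Best Local Similarity = identities / aligned length,
# for alignments with no indels (both segments have equal length).

def alignment_stats(query_seg, db_seg):
    """Return (identical positions, percent identity) for two aligned segments."""
    if len(query_seg) != len(db_seg):
        raise ValueError("segments must have equal length (no gaps supported)")
    matches = sum(q == d for q, d in zip(query_seg, db_seg))
    percent = round(100.0 * matches / len(query_seg), 1)
    return matches, percent


if __name__ == "__main__":
    # Query versus the Q27953 (fin whale beta-casein) segment from this report.
    print(alignment_stats("MHQPPQPLPPTVMFP", "MHQPPQPLPPTPMFP"))
```

Conservative substitutions (the ":" marks in the alignment rows) are not counted here; classifying them would need the full BLOSUM62 matrix.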



SUMMARIES 



Result       Query
 No.  Score  Match Length  DB  ID          Description
   1     89  100.0    222   1  CASB_SHEEP  P11839 ovis aries
   2     89  100.0    224   1  CASB_BUBBU  Q9tsi0 bubalus bub
   3     81   91.0    222   1  CASB_CAPHI  P33048 capra hircu
   4     80   89.9    224   1  CASB_BOVIN  P02666 bos taurus
   5     55   61.8    232   1  CASB_PIG    P39037 sus scrofa
   6     53   59.6    382   1  MEI3_HUMAN  Q99687 homo sapien
   7     53   59.6   5262   1  MLL2_HUMAN  O14686 homo sapien
   8     50   56.2    232   1  CASB_CAMDR  Q9tvd0 camelus dro
   9     49   55.1   1386   1  ZAP3_MOUSE  Q9r0i7 mus musculu
  10     49   55.1   1822   1  ZAP3_HUMAN  P49750 homo sapien
  11     47   52.8    226   1  CASB_HUMAN  P05814 homo sapien
  12     47   52.8    429   1  KNIR_DROME  P10734 drosophila
  13     47   52.8    479   1  ATIN_HSV1F  P04486 herpes simp
  14     47   52.8    490   1  ATIN_HSV11  P06492 herpes simp
  15     47   52.8    490   1  ATIN_HSV2H  P23990 herpes simp
  16     46   51.7    417   1  IRX5_HUMAN  P78411 homo sapien
  17     46   51.7    613   1  GCM2_DROME  Q9vla2 drosophila
  18     46   51.7    690   1  BPL1_YEAST  P48445 s biotin--p
  19     46   51.7    915   1  PDB2_ARATH  O23078 arabidopsis
  20     46   51.7   1094   1  S24C_HUMAN  P53992 homo sapien
  21   45.5   51.1    103   1  SMS2_RANRI  P87385 rana ridibu
  22     45   50.6    351   1  KLF2_RAT    Q9et58 rattus norv
  23     45   50.6    354   1  KLF2_MOUSE  Q60843 mus musculu
  24     45   50.6    416   1  HXD3_HUMAN  P31249 homo sapien
  25     45   50.6    428   1  FXB2_MOUSE  Q64733 mus musculu
  26     45   50.6    431   1  HKTj^ ARATH P48000 arabidopsis
  27     45   50.6   1274   1  ENAM_MOUSE  O55196 mus musculu
  28     45   50.6   2414   1  P300_HUMAN  Q09472 homo sapien
  29   44.5   50.0            1              P97384 mus musculu
  30   44.5   50.0    505   1  ANXB_HUMAN  P50995 homo sapien
  31     44   49.4    292   1  Y0I4_CAEEL  Q09505 caenorhabdi
  32     44   49.4    294   1  YOT9_CAEEL  Q09507 caenorhabdi
  33     44   49.4    419   1  ALF_PETHY   O22621 petunia hyb
  34     44   49.4            1  NH(S7_CAEEL Q9xvv3 caenorhabdi
  35     44   49.4    417   1  YQ4S_METJA  Q58353 methanococc
  36     44   49.4            1  rE4fi_HUMAN Q9y330 homo sapien
  37     44   49.4            1  RA91_XENLA  O93310 xenopus lae
  38     44   49.4            1  DVL2_HUMAN  O14641 homo sapien
  39     44   49.4            1  PROS_DROVI  Q9u6a1 drosophila
  40   43.5   48.9    231   1  ASC1_MOUSE  Q02067 mus musculu
  41   43.5   48.9    233   1  ASC1_RAT    P19359 rattus norv
  42   43.5   48.9    510   1  DHAF_VIBHA  Q56694 vibrio harv
  43   43.5   48.9   1294   1  RRPO_WCMVM  P09498 white clove
  44     43   48.3    178   1  Y561_CHLPN  Q9z7z2 chlamydia p
  45     43   48.3    309   1  NO75_SOYBN  P08297 glycine max



ALIGNMENTS 



RESULT 1 


CASB_SHEEP
ID CASB_SHEEP STANDARD; PRT; 222 AA.
AC P11839;
DT 01-OCT-1989 (Rel. 12, Created)
DT 01-NOV-1995 (Rel. 32, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Beta casein precursor.
GN CSN2.
OS Ovis aries (Sheep).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea;
OC Bovidae; Caprinae; Ovis.
OX NCBI_TaxID=9940;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=89375530; PubMed=2505862;
RA Provot C., Persuy M.A., Mercier J.-C.;
RT "Complete nucleotide sequence of ovine beta-casein cDNA:
RT inter-species comparison.";
RL Biochimie 71:827-832(1989).
RN [2]
RP SEQUENCE FROM N.A.
RX MEDLINE=95197013; PubMed=7890174;



RA Provot C., Persuy M.A., Mercier J.-C.;

RT "Complete sequence of the ovine beta-casein-encoding gene and 

RT interspecies comparison."; 

RL Gene 154:259-263(1995).

RN [3] 

RP SEQUENCE OF 16-222. 

RX MEDLINE=80046695; PubMed=499202;

RA Richardson B.C., Mercier J.-C.;

RT "The primary structure of the ovine beta-caseins.";

RL Eur. J. Biochem. 99:285-297(1979).

CC -!- FUNCTION: Important role in determination of the surface

CC properties of the casein micelles.

CC -!- SUBCELLULAR LOCATION: Secreted.

CC -!- TISSUE SPECIFICITY: Mammary gland specific. Secreted in milk.

CC -!- SIMILARITY: Belongs to the beta-casein family.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; X16482; CAA34502.1; -. 

DR EMBL; X79703; CAA56139.1; -. 

DR PIR; A32979; A32979. 

DR InterPro; IPR001588; Casein. 

DR Pfam; PF00363; caseins; 1. 

DR PROSITE; PS00306; CASEIN_ALPHA_BETA; 1. 

KW Milk; Phosphorylation; Glycoprotein; Signal. 



FT SIGNAL 1 15
FT CHAIN 16 222 BETA CASEIN.
FT MOD_RES 30 30 PHOSPHORYLATION (POTENTIAL).
FT MOD_RES 32 32 PHOSPHORYLATION (POTENTIAL).
FT MOD_RES 33 33 PHOSPHORYLATION (POTENTIAL).
FT MOD_RES 34 34 PHOSPHORYLATION (POTENTIAL).
FT CONFLICT 70 70 A -> T (IN REF. 3).
FT CONFLICT 82 82 P -> A (IN REF. 3).
SQ SEQUENCE 222 AA; 24E n5 MW; 061B4424DCB49EB1 CRC64;



Query Match 100.0%; Score 89; DB 1; Length 222;

Best Local Similarity 100.0%; Pred. No. 6.5e-05;

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MHQPPQPLPPTVMFP 15
       |||||||||||||||
Db 159 MHQPPQPLPPTVMFP 173



RESULT 2 
CASB_BUBBU 

ID CASB_BUBBU STANDARD; PRT; 224 AA.

AC Q9TSI0; O62824;

DT 16-OCT-2001 (Rel. 40, Created)

DT 16-OCT-2001 (Rel. 40, Last sequence update)

DT 10-OCT-2003 (Rel. 42, Last annotation update)



DE Beta casein precursor.
GN CSN2.
OS Bubalus bubalis (Domestic water buffalo).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea;
OC Bovidae; Bovinae; Bubalus.
OX NCBI_TaxID=89462;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Mammary gland;
RA Klotz A., Krause I., Einspanier R.;
RT "Isolation of mRNA from buffalo milk.";
RL Submitted (MAY-1998) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE FROM N.A.
RC TISSUE=Mammary gland;
RA Das P., Garg L.C.;
RT "cDNA cloning and sequencing of beta-casein gene in B. bubalis.";
RL Submitted (OCT-1998) to the EMBL/GenBank/DDBJ databases.
CC -!- FUNCTION: Important role in determination of the surface
CC properties of the casein micelles (By similarity).
CC -!- SUBCELLULAR LOCATION: Secreted.
CC -!- TISSUE SPECIFICITY: Mammary gland specific. Secreted in milk.
CC -!- SIMILARITY: Belongs to the beta-casein family.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC














DR EMBL; AJ005165; CAA06408.1; -.
DR EMBL; AJ005432; CAA06535.1; -.
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
DR PROSITE; PS00306; CASEIN_ALPHA_BETA; 1.
KW Milk; Phosphorylation; Glycoprotein; Signal.
FT SIGNAL 1 15 BY SIMILARITY.
FT CHAIN 16 224 BETA CASEIN.
FT MOD_RES 30 30 PHOSPHORYLATION (BY SIMILARITY).
FT MOD_RES 32 32 PHOSPHORYLATION (BY SIMILARITY).
FT MOD_RES 33 33 PHOSPHORYLATION (BY SIMILARITY).
FT MOD_RES 34 34 PHOSPHORYLATION (BY SIMILARITY).
FT MOD_RES 50 50 PHOSPHORYLATION (BY SIMILARITY).
FT CARBOHYD 70 70 O-LINKED (GALNAC...) (BY SIMILARITY).
FT CARBOHYD 72 72 O-LINKED (GALNAC...) (BY SIMILARITY).
FT CARBOHYD 95 95 O-LINKED (GALNAC...) (BY SIMILARITY).
FT CARBOHYD 183 183 O-LINKED (GALNAC...) (BY SIMILARITY).
FT CONFLICT 117 117 M -> T (IN REF. 2).
SQ SEQUENCE 224 AA; 25106 MW; 14FD3687DD17C5A9 CRC64;



Query Match 100.0%; Score 89; DB 1; Length 224; 

Best Local Similarity 100.0%; Pred. No. 6.5e-05; 

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MHQPPQPLPPTVMFP 15
       |||||||||||||||
Db 159 MHQPPQPLPPTVMFP 173



RESULT 3 
CASB_CAPHI 

ID CASB_CAPHI STANDARD; PRT; 222 AA. 

AC P33048; 

DT 01-OCT-1993 (Rel. 27, Created)

DT 01-OCT-1993 (Rel. 27, Last sequence update)

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Beta casein precursor.

GN CSN2.

OS Capra hircus (Goat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea;

OC Bovidae; Caprinae; Capra.

OX NCBI_TaxID=9925;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=Saanen; TISSUE=Blood;

RX MEDLINE=93077039; PubMed=1446822;

RA Roberts B., Ditullio P., Vitale J., Hehir K., Gordon K.;

RT "Cloning of the goat beta-casein-encoding gene and expression in 

RT transgenic mice."; 

RL Gene 121:255-262(1992). 

CC -!- FUNCTION: Important role in determination of the surface 
CC properties of the casein micelles. 

CC -!- SUBCELLULAR LOCATION: Secreted. 

CC -!- TISSUE SPECIFICITY: Mammary gland specific. Secreted in milk. 

CC -!- SIMILARITY: Belongs to the beta-casein family. 

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; M90561; AAA30906.1; -. 

DR EMBL; M90556; AAA30906.1; JOINED. 

DR EMBL; M90557; AAA30906.1; JOINED. 

DR EMBL; M90558; AAA30906.1; JOINED. 

DR EMBL; M90560; AAA30906.1; JOINED. 

DR PIR; JC1384; JC1384. 

DR InterPro; IPR001588; Casein. 

DR Pfam; PF00363; caseins; 1. 

DR PROSITE; PS00306; CASEIN_ALPHA_BETA; FALSE_NEG. 

KW Milk; Phosphorylation; Glycoprotein; Signal. 



FT SIGNAL 1 15
FT CHAIN 16 222 BETA CASEIN.
FT MOD_RES 30 30 PHOSPHORYLATION (BY SIMILARITY).
FT MOD_RES 32 32 PHOSPHORYLATION (BY SIMILARITY).
FT MOD_RES 33 33 PHOSPHORYLATION (BY SIMILARITY).
FT MOD_RES 34 34 PHOSPHORYLATION (BY SIMILARITY).
FT MOD_RES 50 50 PHOSPHORYLATION (BY SIMILARITY).
SQ SEQUENCE 222 AA; 24865 MW; 96AE17746A01CD05 CRC64;



Query Match 91.0%; Score 81; DB 1; Length 222; 

Best Local Similarity 93.3%; Pred. No. 0.00069; 

Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy   1 MHQPPQPLPPTVMFP 15
       |||||||| ||||||
Db 159 MHQPPQPLSPTVMFP 173



RESULT 4 
CASB_BOVIN 

ID CASB_BOVIN STANDARD; PRT; 224 AA.

AC P02666;

DT 21-JUL-1986 (Rel. 01, Created)

DT 01-MAR-1989 (Rel. 10, Last sequence update)

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Beta casein precursor.

GN CSN2.

OS Bos taurus (Bovine).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea; 

OC Bovidae; Bovinae; Bos. 

OX NCBI_TaxID=9913; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Baev A.A., Smirnov I.K., Gorodetsky S.I.; 

RT "Primary structure of bovine beta-casein cDNA."; 

RL Mol. Biol. (Mosk) 21:214-222(1987). 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=88188989; PubMed=2833669;

RA Stewart A.F., Bonsing J., Beattie C.W., Shah F., Willis I.M., 

RA Mackinlay A.G.; 

RT "Complete nucleotide sequences of bovine alpha S2- and beta-casein 

RT cDNAs: comparisons with related sequences in other species."; 

RL Mol. Biol. Evol. 4:231-241(1987).

RN [3] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=90147279; PubMed=3271384;

RA Bonsing J., Ring J.M., Stewart A.F., Mackinlay A.G.; 

RT "Complete nucleotide sequence of the bovine beta-casein gene."; 

RL Aust. J. Biol. Sci. 41:527-537(1988). 

RN [4] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=87128158; PubMed=3814153;

RA Jimenez-Flores R., Kang Y.C., Richardson T.;

RT "Cloning and sequence analysis of bovine beta-casein cDNA.";

RL Biochem. Biophys. Res. Commun. 142:617-621(1987).

RN [5]

RP SEQUENCE FROM N.A. (VARIANT A3).

RC TISSUE=Mammary gland;

RX MEDLINE=94068382; PubMed=8248100;

RA Simons G., van den Heuvel W., Reynen T., Frijters A., Rutten G., 

RA Slangen C.J., Groenen M., de Vos W.M., Siezen R.J.; 



RT "Overproduction of bovine beta-casein in Escherichia coli and 

RT engineering of its main chymosin cleavage site."; 

RL Protein Eng. 6:763-770(1993).

RN [6]

RP SEQUENCE OF 16-224 (VARIANT A2).

RX MEDLINE=88152252; PubMed=3278933;

RA Carles C., Huet J.-C., Ribadeau-Dumas B.;

RT "A new strategy for primary structure determination of proteins:

RT application to bovine beta-casein.";

RL FEBS Lett. 229:265-272(1988).

RN [7]

RP SEQUENCE OF 16-224 (VARIANT A2).

RX MEDLINE=72233212; PubMed=4557764;

RA Ribadeau-Dumas B., Brignon G., Grosclaude F., Mercier J.-C.;

RT "Primary structure of bovine beta casein. Complete sequence."; 

RL Eur. J. Biochem. 25:505-514(1972). 

RN [8] 

RP VARIANTS A1, B AND C.

RX MEDLINE=72214259; PubMed=5064450;

RA Grosclaude F., Mahe M.-F., Mercier J.-C., Ribadeau-Dumas B.;

RT "Characterization of genetic variants of alpha-S1 and beta bovine

RT caseins.";

RL Eur. J. Biochem. 26:328-337(1972).

RN [9]

RP SEQUENCE OF 118-124 (VARIANT A3).

RX MEDLINE=71252171; PubMed=4997616;

RA Ribadeau-Dumas B., Grosclaude F., Mercier J.-C.;

RT "Localization in the peptide chain of bovine beta casein of the

RT His-Gln substitution differentiating the A2 and A3 genetic

RT variants.";

RL C. R. Acad. Sci., D, Sci. Nat. 270:2369-2372(1970).

RN [10]

RP SEQUENCE OF 48-63 (VARIANT E).

RX MEDLINE=75005247; PubMed=4411121;

RA Grosclaude F., Mahe M.-F., Voglino G.-F.;

RT "The beta E variant and the phosphorylation code of bovine caseins.";

RL FEBS Lett. 45:3-5(1974).

RN [11]

RP SEQUENCE OF 68-105 FROM N.A.

RX MEDLINE=85155504; PubMed=6397405;

RA Ivanov V.N., Kershulite D.R., Bayev A.A., Akhundova A.A.,

RA Sulimova G.E., Judinkova E.S., Gorodetsky S.I.;

RT "Identification of bacterial clones encoding bovine caseins by direct 

RT immunological screening of the cDNA library."; 

RL Gene 32:381-388(1984). 
RN [12] 

RP SEQUENCE OF 68-95 FROM N.A.

RX MEDLINE=86014005; PubMed=3900695;

RA Ivanov V.N., Kershulite D.R., Bayev A.A., Akhundova A.A.,

RA Sulimova G.E.;

RT "Identification of bacterial clones coding for bovine caseins by

RT direct immunologic screening of the cDNA library.";

RL Mol. Biol. (Mosk) 19:955-963(1985).

RN [13]

RP SEQUENCE OF 18-57 FROM N.A., AND SEQUENCE OF 16-224 (VARIANT H).

RX MEDLINE=20154951; PubMed=10690361;

RA Han S.K., Shin Y.C., Byun H.D.;



RT "Biochemical, molecular and physiological characterization of a new 

RT beta-casein variant detected in Korean cattle."; 

RL Anim. Genet. 31:49-51(2000). 

RN [14] 

RP SEQUENCE OF 125-195 (VARIANTS A1 AND G).

RA Dong C., Ng-Kwai-Hang K.F.;

RT "Characterization of a non-electrophoretic genetic variant of beta- 

RT casein by peptide mapping and mass spectrometric analysis."; 

RL Int. Dairy J. 8:967-972(1998). 

RN [15] 

RP SEQUENCE OF 160-171 (VARIANT F).

RX MEDLINE=96118672; PubMed=7496485;

RA Visser S., Slangen C.J., Lagerwerf F.M., Van Dongen W.D.,

RA Haverkamp J.;

RT "Identification of a new genetic variant of bovine beta-casein using 

RT reversed-phase high-performance liquid chromatography and mass 

RT spectrometric analysis."; 

RL J. Chromatogr. A 711:141-150(1995).

RN [16]

RP SEQUENCE OF 170-184 FROM N.A.

RX MEDLINE=83182023; PubMed=6897774;

RA Willis I.M., Stewart A.F., Caputo A., Thompson A.R., McKinlay A.G.;

RT "Construction and identification by partial nucleotide sequence

RT analysis of bovine casein and beta-lactoglobulin cDNA clones.";

RL DNA 1:375-386(1982).

RN [17]

RP CARBOHYDRATE-LINKAGE SITES.

RX MEDLINE=85000478; PubMed=6148101;

RA Yan S.B., Wold F.;

RT "Neoglycoproteins: in vitro introduction of glycosyl units at

RT glutamines in beta-casein using transglutaminase."; 

RL Biochemistry 23:3759-3765(1984). 

CC -!- FUNCTION: Important role in determination of the surface 
CC properties of the casein micelles. 

CC -!- SUBCELLULAR LOCATION: Secreted. 

CC -!- TISSUE SPECIFICITY: Mammary gland specific. Secreted in milk. 

CC -!- POLYMORPHISM: Leu-152 is present in the variants F and G; Gln-190 
CC and Glu-210 are present in the variant H. The sequence shown is 

CC the A2 variant. 

CC -!- SIMILARITY: Belongs to the beta-casein family.

CC -!- DATABASE: NAME=Protein Spotlight;
CC NOTE=Issue 16 of November 2001;

CC WWW="http://www.expasy.org/spotlight/articles/sptlt016.html".

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC

DR EMBL; M16645; AAA30480.1; -.
DR EMBL; M15132; AAA30430.1; -.
DR EMBL; K01087; AAA30481.1; -.
DR EMBL; X06359; CAA29658.1; -.
DR EMBL; M55158; AAA30431.1; -.



DR EMBL; S67277; AAB29137.1; -.
DR EMBL; AF104929; AAD09813.1; -.
DR EMBL; AF104928; AAD09813.1; JOINED.
DR EMBL; M64756; AAB59254.1; -.
DR PIR; I45873; KBBOA2.
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
DR PROSITE; PS00306; CASEIN_ALPHA_BETA; 1.
KW Milk; Phosphorylation; Glycoprotein; Signal; Polymorphism.


r 1 




± 








r 1 


r*U ATM 


± o 


Z Z 4 




D£j1J\ Ej X in - 


r i 


rlUU Kejo 
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J u 




nrL^O I: n.\Jt\l LirVl -L . 


E 1 




O Z 


o z 




irri^o IT nwr\ 1 lif-vl X \Jri * 


r -L 
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i: 1 




'J /t 






PHr^c; PWnP VT ATTOM 


C X 


MOD KHib 


D U 


D U 




PUPiQPUnPVT ATTOM / TM 'V/'APTANT Al VARTAMT 


r 1 










A9 ■^/'APTAMT A^ ^/'APTAUT R A/APTAMT V 












"V/APTAMT TT AAAPTAMT C AMTi VARTAMT H ^ 
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CARBOHYD 


/ u 


/ u 




n— T TKiT^'frn /pat map ^ /papttat ) 


n I 


LAHbOni U 


/ Z 


f z 




O—T TMlifPn /PATMAO ^ 
U XjXIn J\rjiJ ^ (j-MXjIN/IL. . . * ) * 


£ i 


UAKbUrl lU 


Q 






u XjXiNrvrjXJ \ o/vuiN/^*^ . > * ) • 


E 1 


CAKDOrl 1 JJ 


loo 


J. 0 o 






E 1 


VARIAN 1 


/I n 
4 U 


A n 




P — ^ C /TM \/'APTAMT' \ 


FT VARIANT 51 51 E -> K (IN VARIANT E).
FT VARIANT 52 52 E -> K (IN VARIANT C).
FT VARIANT 82 82 P -> H (IN VARIANTS A1, B, C, F AND G).
FT VARIANT 103 103 L -> I (IN VARIANT H).
FT VARIANT 121 121 H -> Q (IN VARIANT A3).
FT VARIANT 132 132 E -> Q (IN VARIANTS A1 AND G).
FT VARIANT 137 137 S -> R (IN VARIANT B).
FT VARIANT 152 153 LP -> PL (IN VARIANTS A1 AND H).
FT VARIANT 153 153 P -> L (IN VARIANT G).
FT VARIANT 167 167 P -> L (IN VARIANT F).
FT VARIANT 190 190 Q -> E (IN VARIANTS A1 AND G).
FT CONFLICT 108 108 M -> L (IN REF. 4 AND 7).
FT CONFLICT 210 210 E -> Q (IN REF. 4 AND 7).
FT CONFLICT 215 224 PVRGPFPIIV -> DPSLLL (IN REF. 1).
SQ SEQUENCE 224 AA; 25107 MW; F0BBDD8148A238AE CRC64;


Query Match 89.9%; Score 80; DB 1; Length 224;

Best Local Similarity 93.3%; Pred. No. 0.00093;

Matches 14; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy   1 MHQPPQPLPPTVMFP 15
       |||| ||||||||||
Db 159 MHQPHQPLPPTVMFP 173



RESULT 5 
CASB_PIG 

ID CASB_PIG STANDARD; PRT; 232 AA. 

AC P39037; 

DT 01-FEB-1995 (Rel. 31, Created)

DT 01-FEB-1995 (Rel. 31, Last sequence update)

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Beta casein precursor.

GN CSN2.

OS Sus scrofa (Pig).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Cetartiodactyla; Suina; Suidae; Sus.

OX NCBI_TaxID=9823; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Mammary gland;

RX MEDLINE=92367961; PubMed=1503277;

RA Alexander L.J., Beattie C.W.;

RT "The sequence of porcine beta-casein cDNA."; 

RL Anim. Genet. 23:369-371(1992). 

RN [2] 

RP SEQUENCE OF 16-29. 

RC TISSUE=Milk; 

RX MEDLINE=22152288; PubMed=12162653;

RA Kauf A.C.W., Kensinger R.S.;

RT "Purification of porcine beta-casein, N-terminal sequence,

RT quantification in mastitic milk.";

RL J. Anim. Sci. 80:1863-1870(2002).

RN [3]

RP CHARACTERIZATION.

RC TISSUE=Milk;

RX MEDLINE=80021173; PubMed=385058;

RA Mulvihill D.M., Fox P.F.;

RT "Isolation and characterization of porcine beta-casein.";

RL Biochim. Biophys. Acta 578:317-324(1979).

CC -!- FUNCTION: Important role in determination of the surface
CC properties of the casein micelles.

CC -!- SUBCELLULAR LOCATION: Secreted.

CC -!- TISSUE SPECIFICITY: Mammary gland specific. Secreted in milk. 

CC -!- SIMILARITY: Belongs to the beta-casein family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X54974; CAA38718.1; -. 

DR InterPro; IPR001588; Casein. 

DR Pfam; PF00363; caseins; 1. 

DR PROSITE; PS00306; CASEIN_ALPHA_BETA; FALSE_NEG. 

KW Milk; Phosphorylation; Glycoprotein; Signal. 



FT SIGNAL 1 15
FT CHAIN 16 232 BETA CASEIN.
FT MOD_RES 30 30 PHOSPHORYLATION (BY SIMILARITY).
FT MOD_RES 32 32 PHOSPHORYLATION (BY SIMILARITY).
FT MOD_RES 33 33 PHOSPHORYLATION (BY SIMILARITY).
FT MOD_RES 34 34 PHOSPHORYLATION (BY SIMILARITY).
FT CARBOHYD 22 22 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 232 AA; 25949 MW; 6284850F40F7365C CRC64;



Query Match 61.8%; Score 55; DB 1; Length 232; 

Best Local Similarity 71.4%; Pred. No. 1.6; 

Matches 10; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy   1 MHQPPQPLPPTVMF 14
       ||| |||:| | ||
Db 158 MHQIPQPVPQTPMF 171



RESULT 6 






NTTMAN 




T rt 
± u 


MFT^ TTTIMAM STANDARD • PRT ' 3 82 AA. 










DT 


lo— jUL^iyyy (Hex. oo^ ureatieaj 




JJ 1 


J. O — L'L-1 — ZUUX ^ r\cX • Jjo-oL- o ct^ut-iiL-t; u^iuca. i-^r y 




riTt 
Ul 


XU W^J. ^UUO ^r\C;X» rt^^ XldOL. CtiilHJL.ClL.XWli Li^^ACl / 




DE 


rlOrneODOX pxOUGxn r'ieXSo \incXbX xtrXciL-trU. pxUL-trXii ; ^ r x dyiucii u J • 




GN 


jyibioo UK MKLjZ . 




OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]




RP 


SEQUENCE FROM N.A. 




RC 


lib b Unj— rsram , 




RA Blum H., Bauersachs S., Mewes H.-W., Weil B., Wiemann S.;
RL Submitted (JUN-2000) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE OF 175-382 FROM N.A.
RX MEDLINE=97202105; PubMed=9049632;
RA Steelman S., Moskow J.J., Muzynski K., North C., Druck T.,
RA Montgomery J.C., Huebner K., Daar I.O., Buchberg A.M.;
RT "Identification of a conserved family of Meis1-related homeobox
RT genes.";
RL Genome Res. 7:142-156(1997).
CC -!- SUBCELLULAR LOCATION: Nuclear (Probable).
CC -!- SIMILARITY: Belongs to the TALE/MEIS homeobox family.
CC -!- SIMILARITY: Contains 1 homeobox domain.
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
DR EMBL; AL359938; CAB95771.1; -.
DR EMBL; U68385; AAB19195.1; -.
DR HSSP; P40424; 1B72.
DR TRANSFAC; T03412;
DR Genew; HGNC:7002; MEIS3.
DR GO; GO:0005634; C:nucleus; ISS.
DR GO; GO:0003677; F:DNA binding; ISS.
DR GO; GO:0008283; P:cell proliferation; ISS.
DR InterPro; IPR001356; Homeobox.
DR Pfam; PF00046; homeobox; 1.
DR ProDom; PD000010; Homeobox; 1.
DR SMART; SM00389; HOX; 1.
DR PROSITE; PS00027; HOMEOBOX_1; FALSE_NEG.
DR PROSITE; PS50071; HOMEOBOX_2; 1.
KW DNA-binding; Nuclear protein; Homeobox.


FT 
r 1 


MON TFR 
IN WIN ± j_j r\ 


1 


1 




r 1 




91 7 

i!L ± / 


256 


SER/THR-RICH . 


r 1 




9 ^ Q 


^ \j \j 


ASP/GLU-RICH (ACIDIC) . 


TTT 




D 


O O J. 


HOMF.OROX f TALE-TYPE) . 


r i 


CONFLICT 


175 


176 


to^ -> Pp flN RFF. 2) . 


FT CONFLICT 209 209 M -> I (IN REF. 2).
FT CONFLICT 245 245 D -> V (IN REF. 2).


FT 


CONFLICT 






R -> P (IN REF. 2) . 


FT 


CONFLICT 


358 


358 


O -> F. ^TN REF 2) 


FT CONFLICT 363 367 VRPPG -> FRAPA (IN REF. 2).
FT CONFLICT 371 377 MSLNLEG -> DEFGTRKE (IN REF
SQ SEQUENCE 382 AA; 41821 MW; A2C11BE8061FB718 CRC64;


Query Match 59.6%; Score 53; DB 1; Length 382;

Best Local Similarity 88.9%; Pred. No. 4.8;

Matches 8; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy  2 HQPPQPLPP 10
      |:|||||||
Db 45 HRPPQPLPP 53





RESULT 7 
MLL2_HUMAN 

ID MLL2_HUMAN STANDARD; PRT; 5262 AA.

AC O14686; O14687;

DT 10-OCT-2003 (Rel. 42, Created)

DT 10-OCT-2003 (Rel. 42, Last sequence update)

DT 15-MAR-2004 (Rel. 43, Last annotation update)

DE Myeloid/lymphoid or mixed-lineage leukemia protein 2 (ALL1-related

DE protein).

GN MLL2 OR ALR.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORMS 1, 2 AND 3).

RX MEDLINE=97388474; PubMed=9247308;

RA Prasad R., Zhadanov A.B., Sedkov Y., Bullrich F., Druck T.,

RA Rallapalli R., Yano T., Alder H., Croce C.M., Huebner K., Mazo A.,

RA Canaani E.;

RT "Structure and expression pattern of human ALR, a novel gene with

RT strong homology to ALL-1 involved in acute leukemia and to Drosophila

RT trithorax.";

RL Oncogene 15:549-560(1997). 

RN [2] 

RP INTERACTION WITH ASC-2/NCOA6 CONTAINING COMPLEX. 

RC TISSUE=Cervical carcinoma; 

RX MEDLINE=22371496; PubMed=12482968;

RA Goo Y.-H., Sohn Y.C., Kim D.-H., Kim S.-W., Kang M.-J., Jung D.-J.,

RA Kwak E., Barlev N.A., Berger S.L., Chow V.T., Roeder R.G.,

RA Azorsa D.O., Meltzer P.S., Suh P.-G., Song E.J., Lee K.-J., Lee Y.C.,

RA Lee J.W.;

RT "Activating signal cointegrator 2 belongs to a novel steady-state 



RT complex that contains a subset of trithorax group proteins."; 

RL Mol. Cell. Biol. 23:140-149(2003). 

CC -!- FUNCTION: May be involved in transcriptional regulation.

CC -!- SUBUNIT: Belongs to the ASC-2/NCOA6 complex (ASCOM), which

CC contains ASC-2/NCOA6, the retinoblastoma-binding protein RBQ-3/

CC RBBP5, alpha- and beta-tubulins, the trithorax group proteins

CC MLL2 and MLL3, and ASH2/ASCL2.

CC -!- SUBCELLULAR LOCATION: Nuclear (Probable).

CC -!- ALTERNATIVE PRODUCTS:

CC Event=Alternative splicing; Named isoforms=3;

CC Name=1;

CC IsoId=O14686-1; Sequence=Displayed;

CC Name=2;

CC IsoId=O14686-2; Sequence=VSP_008563, VSP_008559;

CC Name=3;

CC IsoId=O14686-3; Sequence=VSP_008560;

CC -!- TISSUE SPECIFICITY: Expressed in most adult tissues, including a

CC variety of hematopoietic cells, with the exception of the liver.

CC -!- MISCELLANEOUS: This gene mapped to a chromosomal region involved 

CC in duplications and translocations associated with cancer. 

CC -!- SIMILARITY: Belongs to the transcription factor trithorax family.

CC -!- SIMILARITY: Contains 5 PHD-type zinc fingers.

CC -!- SIMILARITY: Contains 1 post-SET domain.

CC -!- SIMILARITY: Contains 1 RING-type zinc finger. 

CC -!- SIMILARITY: Contains 1 SET domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF010403; AAC51734.1; -. 

DR EMBL; AF010404; AAC51735.1; -. 

DR PIR; T03454; T03454. 

DR PIR; T03455; T03455. 

DR Genew; HGNC:7133; MLL2.

DR MIM; 602113;

DR GO; GO:0005634; C:nucleus; TAS.

DR GO; GO:0003700; F:transcription factor activity; TAS.

DR GO; GO:0007048; P:oncogenesis; TAS.

DR GO; GO:0006366; P:transcription from Pol II promoter; TAS.

DR InterPro; IPR003889; FYrich_C. 

DR InterPro; IPR003888; FYrich_N. 

DR InterPro; IPR000910; HMG_12_box. 

DR InterPro; IPR003616; PostSET. 

DR InterPro; IPR006118; Recombinase. 

DR InterPro; IPR001214; SET. 

DR InterPro; IPR001965; Znf_PHD. 

DR InterPro; IPR001841; Znf_ring. 

DR Pfam; PF00628; PHD; 5. 

DR Pfam; PF00856; SET; 1. 

DR SMART; SM00542; FYRC; 1. 

DR SMART; SM00541; FYRN; 1. 

DR SMART; SM00398; HMG; 1. 



DR SMART; SM00249; PHD; 7.
DR SMART; SM00508; PostSET; 1.
DR SMART; SM00184; RING; 3.
DR SMART; SM00317; SET; 1.
DR PROSITE; PS50868; POST_SET; 1.
DR PROSITE; PS50280; SET; 1.
DR PROSITE; PS01359; ZF_PHD_1; 5.
DR PROSITE; PS50016; ZF_PHD_2; 5.
DR PROSITE; PS50089; ZF_RING_2; 1.
KW Nuclear protein; Transcription regulation; Coiled coil; Zinc-finger;
KW Repeat; Alternative splicing; Polymorphism.


FT 


ZN FING 


9 9 
<i Z D 


Z / D 




t-LliJ 1 1 IT Hi X . 


FT 


ZN FING 


9 9 Q 


9 7 4 
Z / T 




"DTMr* TVDT? 

KXPJLj i 1 rrlLi . 


FT 


ZN FING 




^9 ^ 
oZ o 




rtlU 1 I r Hj Z . 


FT 


ZN FING 




1 1 ^ 
X X D O 




rrlL' 1 1 r iIj o , 


FT 


ZN FING 


1 1 ^9 


1 9 n 9 
XZ Uz 




irnu— i I r rj 4 . 


FT 


ZN FING 


1 9 9 Q 
X Z Z 


1 9 P /I 
XZ o ^ 




irrlU i 1 r 0 . 


FT 


DOMAIN 


O X Z X 


OZ 4 Z 






FT 


DOMAIN 


R9 /d 
Dz 4 D 


OZ OZ 




rUb I — bhji , 


FT 


DOMAIN 


9 "5 Q "7 

z o y / 


9 yl T £^ 
Z 4 o D 




UUXXiCjD UUXXi ^ rUi iIjN i XAXi J . 


FT 


DOMAIN 


9 7 0 0 
Z / O O 


9 Q n Q 

z cf u y 




UUXLcjU LUXL ( rOi hjN i XAL ) . 


FT 


DOMAIN 


z y / H 


n n 1 

o U U X 




UUXXitU LUXXi (FUlhjNiXALJ , 


FT 


DOMAIN 


oZ O O 


O O 4i Z 




L-UxhiLiU L-UXXi ^ rUi HjJN i XAXi J . 


FT 


DOMAIN 


Q /l Q "7 


O 4 / b 




CUXXilLU UUXL ^ rUi HjJN 1 XAXi J . 


FT 


DOMAIN 


^ f^9 1 
J OZ X 


^ 7 n 1 

o / U X 




UUXXiIIjJJ CUXXj [ rUi hJN i XAXi ) . 


FT 


DOMAIN 


'I Z DO 


4 9 P 7 
4 Z O / 




L-UXXiIjU LUXLi ^ rUl rjN 1 XAXj J . 


FT 


DOMAIN 


4 -5 " 


D D 0 




V ^ AA DT?DTAT'C! r\TP C/n ID D t/T) TP / 7\ 

ID A D KtriliAlo Ur b/ r~r~r — hi/ r — L/ A. 


FT 


REPEAT 


4 4 9 

*i T Z 


4 4^ 
T rl D 




1 

X > 


FT 


REPEAT 


4 D U 


4^4 
fi D 




9 

Z • 


FT 


REPEAT 


D 


4 7 




o 

•D ■ 


FT 


REPEAT 


4 Q 


son 

o u u 




4 > 


FT 


REPEIAT 


.-J U H 


o u o 




c. 


FT 


REPEAT 


521 


S9 S 
^ i. J 




u • 


FT 


REPEAT 


s s s 

J J -J 


S Q 

O O 13 




7 


FT 


REPEAT 


O O T 


-J D O 




« 

o • 


FT 


REPEAT 


S7 ^ 
o / o 


S77 




Q 

_? ■ 


FT REPEAT 582 586 10.
FT REPEAT 609 613 11.
FT REPEAT 618 622 12.
FT REPEAT 627 631 13.
FT REPEAT 645 649 14.
FT REPEAT 663 667 15.
FT DOMAIN 229 326 CYS-RICH.
FT DOMAIN 374 922 PRO-RICH.
FT DOMAIN 1015 1053 ARG-RICH.
FT DOMAIN 1122 1235 CYS-RICH.
FT DOMAIN 1832 2351 PRO-RICH.
FT DOMAIN 2536 2547 GLN-RICH.
FT DOMAIN 2587 2703 PRO-RICH.
FT DOMAIN 2986 4000 GLN-RICH.
FT DOMAIN 3966 4085 PRO-RICH.
FT DOMAIN 4634 4702 PRO-RICH.
FT VARSPLIC 1 305 Missing (in isoform 2).
FT /FTId=VSP_008563.
FT VARSPLIC 306 672 PMEELPAHSWKCKACRVCRACGAGSAELNPNSEWFENYSLC
FT HRCHKAQGGQTIRSVAEQHTPVCSRFSPPEPGDTPTDEPDA
FT LYVACQGQPKGGHVTSMQPKEPGPLQCEAKPLGKAGVQLEP



FT QLEAPLNEEMPLLPPPEESPLSPPPEESPTSPPPEASRLSP 

FT PPEELPASPLPEALHLSRPLEESPLSPPPEESPLSPPPESS 

FT PFSPLEESPLSPPEESPPSPALETPLSPPPEASPLSPPFEE 

FT SPLSPPPEELPTSPPPEASRLSPPPEESPMSPPPEESPMSP 

FT PPEASRLFPPFEESPLSPPPEESPLSPPPEASRLSPPPEDS 

FT PMSPPPEESPMSPPPEVSRLSPLPWSRLSPPPEESPLS 

FT -> MSPPPEESPMSPPPEASRLFPPFEESPLSPPPEESPLS 

FT PPPEASRLSPPPEDSPMSPPPEESPMSPPPEVSRLSPLPW 

FT SRLSPPPEESPLSPPPEESPTSPPPEASRLSPPPEDSPTSP 

FT PPEDSPASPPPEDSLMSLPLEESPLLPLPEEPQLCPRSEGP 

FT HLSPRPEEPHLSPRPEEPHLSPQAEEPHLSPQPEEPCLCAV

FT PEEPHLSPQAEGPHLSPQPEELHLSPQTEEPHLSPVPEEPC 

FT LSPQPEESHLSPQSEEPCLSPRPEESHLSPELEKPPLSPRP 

FT EKPPEEPGQCPAPEELPLFPPPGEPSLSPLLGEPALSEPGE 

FT PPLSPLPEELPLSPSGEPSLSPQLMPPDPLPPPLSPIITAA 

FT A (in isoform 2).

FT /FTId=VSP_008559.

FT VARSPLIC 1454 1454 E -> EGET (in isoform 3).

FT /FTId=VSP_008560.

FT VARIANT 4949 4949 R -> H (in dbSNP:3782356).

FT /FTId=VAR_017115.

SQ SEQUENCE 5262 AA; 564171 MW; 26B7C74CAD417E44 CRC64; 

Query Match 59.6%; Score 53; DB 1; Length 5262; 
Best Local Similarity 57.1%; Pred. No. 70; 

Matches 8; Conservative 3; Mismatches 3; Indels 0; Gaps 0; 

Qy          1 MHQPPQPLPPTVMF 14
              :|:||:| || | |
Db       2191 LHKPPRPQPPEVAF 2204
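
The similarity figure reported for each hit can be checked by hand. The sketch below assumes that "Best Local Similarity" is the number of identical residues divided by the aligned length; this is consistent with the values in this listing but is not stated by it, and the function name `identity_stats` is hypothetical.

```python
def identity_stats(query: str, subject: str) -> tuple[int, int, float]:
    """Count identical positions in two equal-length, gap-free aligned segments."""
    if len(query) != len(subject):
        raise ValueError("aligned segments must have equal length")
    # A position counts as a match only when both residues are the same letter.
    matches = sum(q == s for q, s in zip(query, subject))
    return matches, len(query), round(100.0 * matches / len(query), 1)


# Qy/Db segments copied from the alignment directly above.
matches, length, percent = identity_stats("MHQPPQPLPPTVMF", "LHKPPRPQPPEVAF")
print(matches, length, percent)
```

For this alignment the function reports 8 matches over 14 positions, in line with the "Matches 8" and "57.1%" values printed in the result block.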



RESULT 8 
CASB_CAMDR 

ID CASB_CAMDR STANDARD; PRT; 232 AA. 

AC Q9TVD0; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Beta casein precursor.

GN CSN2.

OS Camelus dromedarius (Dromedary) (Arabian camel).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Cetartiodactyla; Tylopoda; Camelidae; Camelus.

OX NCBI_TaxID=9838;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Somali; TISSUE=Mammary gland; 

RX MEDLINE=98291310; PubMed=9627840;

RA Kappeler S., Farah Z., Puhan Z.; 

RT "Sequence analysis of Camelus dromedarius milk caseins."; 

RL J. Dairy Res. 65:209-222(1998). 

CC -!- FUNCTION: Important role in determination of the surface
CC properties of the casein micelles (By similarity).

CC -!- SUBCELLULAR LOCATION: Secreted.

CC -!- TISSUE SPECIFICITY: Mammary gland specific. Secreted in milk.

CC -!- SIMILARITY: Belongs to the beta-casein family.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).



DR EMBL; AJ012630; CAA10079.1;
DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
DR PROSITE; PS00306; CASEIN_ALPHA_BETA; FALSE_NEG.
KW Milk; Phosphorylation; Glycoprotein; Signal.
FT SIGNAL 1 15 BY SIMILARITY.
FT CHAIN 16 232 BETA CASEIN.
FT MOD_RES 30 30 PHOSPHORYLATION (BY SIMILARITY).
FT MOD_RES 32 32 PHOSPHORYLATION (BY SIMILARITY).
FT MOD_RES 33 33 PHOSPHORYLATION (BY SIMILARITY).
FT MOD_RES 34 34 PHOSPHORYLATION (BY SIMILARITY).
SQ SEQUENCE 232 AA; 26218 MW; A0F9F41D2EA7C518 CRC64;



Query Match 56.2%; Score 50; DB 1; Length 232; 

Best Local Similarity 60.0%; Pred. No. 6.9; 

Matches 9; Conservative 2; Mismatches 4; Indels 0; Gaps 0; 

Qy          1 MHQPPQPLPPTVMFP 15
              |:| |||:| | | |
Db        160 MYQIPQPVPQTPMIP 174
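
The "Query Match" percentage appears to be the hit score expressed as a fraction of the score the query obtains against itself (89, the score of the 100.0% match in RESULT 1). That definition is an assumption inferred from the numbers in this listing, and the helper name `query_match_percent` is hypothetical; a minimal sketch:

```python
SELF_SCORE = 89  # score of the query against itself (RESULT 1, Query Match 100.0%)


def query_match_percent(score: int, self_score: int = SELF_SCORE) -> float:
    """Express a hit score as a percentage of the query self-score (assumed definition)."""
    return round(100.0 * score / self_score, 1)


# Scores taken from the result blocks in this listing.
for score in (53, 50, 49, 47):
    print(score, query_match_percent(score))
```

Under this assumption the scores 53, 50, 49 and 47 map to 59.6%, 56.2%, 55.1% and 52.8%, the same Query Match values the listing reports for those hits.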



RESULT 9 
ZAP3_MOUSE

ID ZAP3_MOUSE STANDARD; PRT; 1386 AA.

AC Q9R0I7; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Nuclear protein ZAP3. 

GN ZAP3 OR ZAP. 

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Fetal liver; 

RA Misawa K., Nosaka T., Kitamura T.;

RT "A huge nuclear protein rich in proline similar to human hypothetical

RT protein zap3 and zap113.";

RL Submitted (OCT-1999) to the EMBL/GenBank/DDBJ databases. 

CC -!- SUBCELLULAR LOCATION: Nuclear (Potential). 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; AB033168; BAA85182.1;
DR MGD; MGI:1926195; Zap3.
DR GO; GO:0005634; C:nucleus; IDA.
KW Nuclear protein.
FT DOMAIN 15 204 PRO-RICH.
FT DOMAIN 355 473 GLN-RICH.
FT DOMAIN 925 1012 ARG-RICH.
SQ SEQUENCE 1386 AA; 155130 MW; D862F9918ED221DF CRC64;

Query Match 55.1%; Score 49; DB 1; Length 1386;
Best Local Similarity 57.1%; Pred. No. 58;
Matches 8; Conservative 1; Mismatches 5; Indels 0; Gaps 0;

Qy          2 HQPPQPLPPTVMFP 15
              | || ||||  : |
Db         81 HLPPPPLPPPPVMP 94



RESULT 10 
ZAP3_HUMAN 

ID ZAP3_HUMAN STANDARD; PRT; 1822 AA. 

AC P49750; P49752; Q9P1V7; 

DT 01-OCT-1996 (Rel. 34, Created)

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Nuclear protein ZAP3 (ZAP113). 

GN ZAP3.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Rowen L., Madan A., Qin S., Abbasi N., Baradarani L., Birditt B.,

RA Bloom S., Dors M., Dickhoff R., Fleetwood P., Harrison G., James R.,

RA Kaur A., Madan A., Owen M.P., Ratcliffe A., Shaffer T., Hood L.;

RT "Sequencing of human chromosome 14q24.3 region.";

RL Submitted (MAY-2000) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP SEQUENCE OF 539-847 AND 1397-1822 FROM N.A. 

RC TISSUE=Brain; 

RX MEDLINE=95319502; PubMed=7596406;

RA Sherrington R., Rogaev E.I., Liang Y., Rogaeva E.A., Levesque G.,

RA Ikeda M., Chi H., Lin C., Li G., Holman K., Tsuda T., Mar L.,

RA Foncin J.-F., Bruni A.C., Montesi M.P., Sorbi S., Rainero I.,

RA Pinessi L., Nee L., Chumakov I., Pollen D., Brookes A.,

RA Sanseau P., Polinsky R.J., Wasco W., da Silva H.A.R., Haines J.L.,

RA Pericak-Vance M.A., Tanzi R.E., Roses A.D., Fraser P.E.,

RA Rommens J.M., St George-Hyslop P.H.;

RT "Cloning of a gene bearing missense mutations in early-onset familial

RT Alzheimer's disease.";

RL Nature 375:754-760(1995).

CC -!- SUBCELLULAR LOCATION: Nuclear (Potential). 



CC -!- CAUTION: Ref.2 sequence differs from that shown due to a
CC frameshift in position 1661.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; AC007956; AAF61275.1; -.
DR EMBL; L40403; AAC42008.1; ALT_FRAME.
DR EMBL; L40400; AAC42006.1; -.
KW Nuclear protein.
FT DOMAIN 15 205 PRO-RICH.
FT DOMAIN 382 430 GLN-RICH.
FT DOMAIN 807 1209 ARG-RICH.
FT DOMAIN 1488 1577 ARG-RICH.
FT CONFLICT 621 621 P -> S (IN REF. 2).
FT CONFLICT 1404 1404 T -> I (IN REF. 2).
FT CONFLICT 1821 1821 K -> E (IN REF. 2).
SQ SEQUENCE 1822 AA; 204947 MW; 8E6CB83FE540C7D2 CRC64;

Query Match 55.1%; Score 49; DB 1; Length 1822;
Best Local Similarity 57.1%; Pred. No. 77;
Matches 8; Conservative 1; Mismatches 5; Indels 0; Gaps 0;

Qy          2 HQPPQPLPPTVMFP 15
              | || ||||  : |
Db         80 HLPPPPLPPPPVMP 93



RESULT 11 
CASB_HUMAN 

ID CASB_HUMAN STANDARD; PRT; 226 AA. 

AC P05814; 

DT 01-NOV-1988 (Rel. 09, Created)

DT 01-AUG-1992 (Rel. 23, Last sequence update)

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Beta casein precursor.

GN CSN2 OR CASB.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Breast; 

RA Menon R.S.;

RL Submitted (OCT-1989) to the EMBL/GenBank/DDBJ databases. 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=90353560; PubMed=2387396;

RA Loennerdal B., Bergstroem S., Andersson Y., Hjalmarsson K.,
RA Sundqvist A.K., Hernell O.;

RT "Cloning and sequencing of a cDNA encoding human milk beta-casein.";

RL FEBS Lett. 269:153-156(1990).

RN [3]

RP SEQUENCE FROM N.A.

RC TISSUE=Placenta;

RX MEDLINE=94156198; PubMed=8112603;

RA Hansson L., Edlund A., Johansson T., Hernell O., Stroemqvist M.,

RA Lindquist S., Loennerdal B., Bergstroem S.;

RT "Structure of the human beta-casein encoding gene."; 

RL Gene 139:193-199(1994). 

RN [4] 

RP SEQUENCE FROM N.A.

RC TISSUE=Placenta;

RA Kwiatkowski D.J.;

RL Submitted (DEC-1997) to the EMBL/GenBank/DDBJ databases.

RN [5]

RP SEQUENCE OF 161-226 FROM N.A.

RC TISSUE=Breast;

RX MEDLINE=89240053; PubMed=2717418;

RA Menon R.S., Ham R.G.;

RT "Human beta-casein: partial cDNA sequence and apparent polymorphism.";

RL Nucleic Acids Res. 17:2869-2869(1989).

RN [6] 

RP SEQUENCE OF 16-226. 

RX MEDLINE=84185624; PubMed=6715339; 

RA Greenberg R., Groves M.L., Dower H.J.;

RT "Human beta-casein. Amino acid sequence and identification of 

RT phosphorylation sites."; 

RL J. Biol. Chem. 259:5132-5138(1984). 

CC -!- FUNCTION: Important role in determination of the surface 
CC properties of the casein micelles. 

CC -!- SUBCELLULAR LOCATION: Secreted.

CC -!- TISSUE SPECIFICITY: Mammary gland specific. Secreted in milk.

CC -!- SIMILARITY: Belongs to the beta-casein family.

CC -!- DATABASE: NAME=Protein Spotlight;

CC NOTE=Issue 16 of November 2001;

CC WWW="http://www.expasy.org/spotlight/articles/sptlt016.html".

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X17070; CAA34916.1; -. 

DR EMBL; X13766; CAA32017.1; -.

DR EMBL; AF027807; AAC82978.1; -.

DR EMBL; X55739; CAA39270.1; -.

DR EMBL; A24287; CAA01728.1; -.

DR EMBL; A30262; CAA02017.1; -.

DR PIR; 153730; KBHU.

DR Genew; HGNC:2447; CSN2.

DR MIM; 115460; -.

DR GO; GO:0005509; F:calcium ion binding; TAS.

DR GO; GO:0004857; F:enzyme inhibitor activity; TAS.

DR GO; GO:0006816; P:calcium ion transport; TAS.



DR InterPro; IPR001588; Casein.
DR Pfam; PF00363; caseins; 1.
DR PROSITE; PS00306; CASEIN_ALPHA_BETA; 1.
KW Milk; Phosphorylation; Glycoprotein; Signal.
FT SIGNAL 1 15
FT CHAIN 16 226 BETA CASEIN.
FT MOD_RES 18 18 PHOSPHORYLATION.
FT MOD_RES 21 21 PHOSPHORYLATION.
FT MOD_RES 23 23 PHOSPHORYLATION.
FT MOD_RES 24 24 PHOSPHORYLATION.
FT MOD_RES 25 25 PHOSPHORYLATION.
FT CONFLICT 30 30 T -> P (IN REF. 6).
FT CONFLICT 34 34 MISSING (IN REF. 2).
FT CONFLICT 48 50 EDE -> TDO (IN REF. [illegible]).
FT CONFLICT 120 120 S -> Q (IN REF. 6).
FT CONFLICT 133 133 L -> V (IN REF. 1).
FT CONFLICT 140 140 H -> Q (IN REF. 1).
FT CONFLICT 149 149 L -> S (IN REF. 6).
FT CONFLICT 173 173 Q -> E (IN REF. 6).
FT CONFLICT 182 184 QW -> EVL (IN REF. 6).
FT CONFLICT 188 188 Q -> V (IN REF. 6).
FT CONFLICT 207 207 T -> P (IN REF. 6).
FT CONFLICT 214 222 TQPLAPVHN -> PEPSTTZABH (IN REF. 6).
SQ SEQUENCE 226 AA; 25382 MW; 2619C524EA1358E8 CRC64;

Query Match 52.8%; Score 47; DB 1; Length 226;
Best Local Similarity 53.3%; Pred. No. 16;
Matches 8; Conservative 2; Mismatches 5; Indels 0; Gaps 0;

Qy          1 MHQPPQPLPPTVMFP 15
              | | |||:| |:  |
Db        150 MQQVPQPIPQTLALP 164



RESULT 12 
KNIR_DROME 

ID KNIR_DROME STANDARD; PRT; 429 AA. 

AC P10734; Q9VPC6; 

DT 01-JUL-1989 (Rel. 11, Created)

DT 01-JUL-1989 (Rel. 11, Last sequence update)

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Zygotic gap protein knirps.

GN KNI OR NR0A1 OR CG4717.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;

OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7227;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Oregon-R; TISSUE=Salivary gland; 

RX MEDLINE=89057148; PubMed=2904128;

RA Nauber U., Pankratz M.J., Kienlin A., Seyffert E., Klemm U.,

RA Jaeckle H.;

RT "Abdominal segmentation of the Drosophila embryo requires a hormone

RT receptor-like protein encoded by the gap gene knirps.";

RL Nature 336:489-492(1988).



RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Berkeley;

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,

RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,

RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,

RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,

RA Wan K.H., Doyle Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,

RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,

RA Cherry J.M., Cawley S., Dahlke Davenport L.B., Davies P.,

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,

RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,

RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,

RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,

RA Harris N.L., Harvey D.A., Heiman T.J., Hernandez J.R., Houck J.,

RA Hostin D., Houston K.A., Rowland T.J., Wei M.-H., Ibegwam C.,

RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,

RA Kimmel B.E., Kodira C.D., Kraft Kravitz S., Kulp D., Lai Z.,

RA Lasko P., Lei Y., Levitsky A.A., Li J.H., Li Z., Liang Y., Lin X.,

RA Liu X., Mattel B., McIntosh T.C., McLeod M.P., McPherson D.,

RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,

RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,

RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,

RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,

RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,

RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,

RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,

RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,

RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,

RA Zheng X.H., Zhong F.N., Zheng W., Zhou X., Zhu S., Zhu X., Smith H.O.,

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;

RT "The genome sequence of Drosophila melanogaster.";

RL Science 287:2185-2195(2000). 

RN [3] 

RP CHARACTERIZATION. 

RX MEDLINE=96312963; PubMed=8670869; 

RA Arnosti D.N., Gray S., Barolo S., Zhou J., Levine M.;

RT "The gap protein knirps mediates both quenching and direct repression 

RT in the Drosophila embryo."; 

RL EMBO J. 15:3659-3666(1996). 

CC -!- FUNCTION: TRANSCRIPTIONAL REPRESSOR. BINDS TO MULTIPLE SITES IN

CC THE EVE STRIPE 3 ENHANCER ELEMENT. PLAYS AN ESSENTIAL ROLE IN THE

CC SEGMENTATION PROCESS BOTH BY REFINING THE EXPRESSION PATTERNS OF

CC GAP GENES AND BY ESTABLISHING PAIR-RULES STRIPES OF GENE

CC EXPRESSION.

CC -!- SUBCELLULAR LOCATION: Nuclear.

CC -!- SIMILARITY: Belongs to the nuclear hormone receptor family. NR0
CC subfamily.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X13331; CAA31709.1;
DR EMBL; AE003592; AAF51629.2; -.
DR PIR; S01919; S01919.
DR HSSP; P03372; 1HCP.
DR TRANSFAC; T00445;
DR FlyBase; FBgn0001320; kni.
DR GO; GO:0004879; F:ligand-dependent nuclear receptor activity; NAS.
DR GO; GO:0007088; P:regulation of mitosis; IMP.
DR InterPro; IPR001628; Znf_C4steroid.
DR Pfam; PF00105; zf-C4; 1.
DR PRINTS; PR00047; STROIDFINGER.
DR ProDom; PD000035; Znf_C4steroid; 1.
DR SMART; SM00399; ZnF_C4; 1.
DR PROSITE; PS00031; NUCLEAR_RECEPTOR; 1.
KW Receptor; Transcription regulation; DNA-binding; Nuclear protein;
KW Zinc-finger; Developmental protein; Repressor.
FT DNA_BIND 5 71 NUCLEAR RECEPTOR-TYPE.
FT ZN_FING 5 25 C4-TYPE.
FT ZN_FING 42 66 C4-TYPE.
FT DOMAIN 97 101 POLY-ALA.
FT DOMAIN 137 142 POLY-HIS.
FT DOMAIN 143 149 POLY-GLN.
FT DOMAIN 200 213 POLY-ALA.
FT DOMAIN 375 382 POLY-SER.
SQ SEQUENCE 429 AA; 45611 MW; 79CEE86A66AB00C7 CRC64;

Query Match 52.8%; Score 47; DB 1; Length 429;
Best Local Similarity 57.1%; Pred. No. 32;
Matches 8; Conservative 2; Mismatches 4; Indels 0; Gaps 0;

Qy          2 HQPPQPLPPTVMFP 15
              || |  ||| ::||
Db        183 HQSPFQLPPHLLFP 196



RESULT 13 
ATIN_HSV1F 

ID ATIN_HSV1F STANDARD; PRT; 479 AA. 

AC P04486; 

DT 13-AUG-1987 (Rel. 05, Created) 

DT 13-AUG-1987 (Rel. 05, Last sequence update) 

DT 01-OCT-1996 (Rel. 34, Last annotation update)

DE Alpha trans-inducing protein (VMW65) (ICP25) (VP16 protein)

DE (Alpha-TIF).

GN UL48.

OS Herpes simplex virus (type 1 / strain F).

OC Viruses; dsDNA viruses, no RNA stage; Herpesviridae;

OC Alphaherpesvirinae; Simplexvirus.

OX NCBI_TaxID=10304;

RN [1]

RP SEQUENCE FROM N.A.

RX MEDLINE=85298259; PubMed=2994050;

RA Pellett P.E., McKnight J.L.C., Jenkins F.J., Roizman B.; 

RT "Nucleotide sequence and predicted amino acid sequence of a protein 

RT encoded in a small herpes simplex virus DNA fragment capable of 

RT trans-inducing alpha genes."; 

RL Proc. Natl. Acad. Sci. U.S.A. 82:5870-5874(1985). 

CC -!- FUNCTION: RESPONSIBLE FOR TRANSCRIPTIONAL ACTIVATION OF IMMEDIATE 

CC EARLY PROMOTERS (ALPHA GENES). ACTIVATION REQUIRES THE FORMATION 

CC OF A HETEROMERIC COMPLEX WITH THE HOST CELL FACTOR. THESE TWO 

CC PROTEINS THEN ASSEMBLE WITH THE OCTAMER MOTIF-BINDING PROTEIN 

CC OCT-1 ON THE CIS-ACTING TARGET SEQUENCE: TAATGARAT.

CC -!- SIMILARITY: TO OTHER HERPESVIRUSES ALPHA TRANS-INDUCING PROTEIN.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; K03350; AAA45766.1; -. 

DR PIR; A03727; IXBElF. 

DR InterPro; IPR003174; Alpha_TIF. 

DR Pfam; PF02232; Alpha_TIF; 1. 

KW Transcription regulation; Trans-acting factor; DNA-binding. 

SQ SEQUENCE 479 AA; 53053 MW; 8DFF24AC1717A1C6 CRC64; 

Query Match 52.8%; Score 47; DB 1; Length 479; 

Best Local Similarity 50.0%; Pred. No. 35; 

Matches 7; Conservative 2; Mismatches 5; Indels 0; Gaps 0; 

Qy          1 MHQPPQPLPPTVMF 14
              |  || |:||  :|
Db         38 MPSPPMPVPPAALF 51



RESULT 14 
ATIN_HSV11 

ID ATIN_HSV11 STANDARD; PRT; 490 AA. 

AC P06492;

DT 01-JAN-1988 (Rel. 06, Created)

DT 01-JAN-1988 (Rel. 06, Last sequence update)

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Alpha trans-inducing protein (VMW65) (ICP25) (VP16 protein)

DE (Alpha-TIF).

GN UL48.

OS Herpes simplex virus (type 1 / strain 17).

OC Viruses; dsDNA viruses, no RNA stage; Herpesviridae;

OC Alphaherpesvirinae; Simplexvirus.

OX NCBI_TaxID=10299;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=88274327; PubMed=2839594;

RA McGeoch D.J., Dalrymple M.A., Davison A.J., Dolan A., Frame M.C.,

RA McNab D., Perry Scott J.E., Taylor P.;

RT "The complete DNA sequence of the long unique region in the genome of

RT herpes simplex virus type 1.";

RL J. Gen. Virol. 69:1531-1574(1988).

RN [2]

RP SEQUENCE FROM N.A.

RX MEDLINE=86067203; PubMed=2999707;

RA Dalrymple M.A., McGeoch D.J., Davison A.J., Preston C.M.;

RT "DNA sequence of the herpes simplex virus type 1 gene whose product

RT is responsible for transcriptional activation of immediate early

RT promoters.";

RL Nucleic Acids Res. 13:7865-7879(1985).

RN [3]

RP DNA-BINDING.

RX MEDLINE=90005439; PubMed=2676518;

RA Cousens D.J., Greaves R., Goding C.R., O'Hare P.;

RT "The C-terminal 79 amino acids of the herpes simplex virus regulatory 

RT protein, Vmw65, efficiently activate transcription in yeast and 

RT mammalian cells in chimeric DNA-binding proteins."; 

RL EMBO J. 8:2337-2342(1989). 

CC -!- FUNCTION: RESPONSIBLE FOR TRANSCRIPTIONAL ACTIVATION OF IMMEDIATE
CC EARLY PROMOTERS (ALPHA GENES). ACTIVATION REQUIRES THE FORMATION

CC OF A HETEROMERIC COMPLEX WITH THE HOST CELL FACTOR. THESE TWO

CC PROTEINS THEN ASSEMBLE WITH THE OCTAMER MOTIF-BINDING PROTEIN

CC OCT-1 ON THE CIS-ACTING TARGET SEQUENCE: TAATGARAT.

CC -!- SIMILARITY: TO OTHER HERPESVIRUSES ALPHA TRANS-INDUCING PROTEIN.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X14112; CAA32298.1; -. 

DR EMBL; X03141; CAA26913.1; -. 

DR PIR; A24118; IXBE17.

DR PDB; 16VP; 26-SEP-01.

DR TRANSFAC; T00894; -.

DR InterPro; IPR003174; Alpha_TIF.

DR Pfam; PF02232; Alpha_TIF; 1.

KW Transcription regulation; Trans-acting factor; DNA-binding;

KW 3D-structure.

FT DNA_BIND 411 490 EXPERIMENTALLY DEDUCED.

FT SITE 442 442 CRITICAL ROLE IN ACTIVATION. 

SQ SEQUENCE 490 AA; 54345 MW; 8DDDEDEDB2A699D3 CRC64; 



Query Match 52.8%; Score 47; DB 1; Length 490; 

Best Local Similarity 50.0%; Pred. No. 36; 

Matches 7; Conservative 2; Mismatches 5; Indels 0; Gaps 0; 



Qy          1 MHQPPQPLPPTVMF 14
              |  || |:||  :|
Db         49 MPSPPMPVPPAALF 62



RESULT 15 
ATIN_HSV2H

ID ATIN_HSV2H STANDARD; PRT; 490 AA.

AC P23990; P29793;

DT 01-APR-1993 (Rel. 25, Created)

DT 01-NOV-1995 (Rel. 32, Last sequence update)

DT 01-NOV-1997 (Rel. 35, Last annotation update)

DE Alpha trans-inducing protein (VMW65) (ICP25) (VP16 protein)

DE (Alpha-TIF).

GN VP16 OR UL48.

OS Herpes simplex virus (type 2 / strain HG52), and

OS Herpes simplex virus (type 2 / strain 333).

OC Viruses; dsDNA viruses, no RNA stage; Herpesviridae;

OC Alphaherpesvirinae; Simplexvirus.

OX NCBI_TaxID=10315, 10313;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=HG52; 

RX MEDLINE=91365250; PubMed=1653757;

RA Cress A., Triezenberg S.J.;

RT "Nucleotide and deduced amino acid sequences of the gene encoding

RT virion protein 16 of herpes simplex virus type 2.";

RL Gene 103:235-238(1991).

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=HG52; 

RA Dolan A.;

RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases. 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=333; 

RX MEDLINE=92046332; PubMed=1658370; 

RA Greaves R.F., O'Hare P.;

RT "Sequence, function, and regulation of the Vmw65 gene of herpes 

RT simplex virus type 2."; 

RL J. Virol. 65:6705-6713(1991). 

CC -!- FUNCTION: RESPONSIBLE FOR TRANSCRIPTIONAL ACTIVATION OF IMMEDIATE 
CC EARLY PROMOTERS (ALPHA GENES).

CC -!- SIMILARITY: TO OTHER HERPESVIRUSES ALPHA TRANS-INDUCING PROTEIN. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; M60050; AAA45863.1; 

DR EMBL; Z86099; CAB06734.1; 

DR EMBL; M75098; AAA45862.1; 

DR InterPro; IPR003174; Alpha_TIF. 

DR Pfam; PF02232; Alpha_TIF; 1. 

KW Transcription regulation; Trans-acting factor; DNA-binding. 

FT SITE 443 443 CRITICAL ROLE IN ACTIVATION. 

FT CONFLICT 12 12 A -> R (IN REF. 1). 



SQ SEQUENCE 490 AA; 54620 MW; 2E9FDA8D0D8BC174 CRC64;



Query Match 52.8%; Score 47; DB 1; Length 490; 

Best Local Similarity 50.0%; Pred. No. 36; 

Matches 7; Conservative 2; Mismatches 5; Indels 0; Gaps 0; 

Qy          1 MHQPPQPLPPTVMF 14
              |  || |:||  :|
Db         47 MPSPPMPVPPAALF 60



Search completed: August 24, 2004, 15:43:47 
Job time : 10.0597 secs



